SELF-REARRANGING DNA VECTORS

Background of the Invention 
The invention relates to DNA vectors.

Mammalian cell expression vectors based on DNA viruses have been widely
discussed as gene delivery vehicles for genetic therapy. Among the different DNA
viruses proposed for this purpose have been adenoviruses, baculovirus, Epstein Barr
virus, and herpes simplex virus. In addition, other smaller viruses that have an
intranuclear phase in which the viral genome is present as a double stranded DNA, such
as retroviruses and parvoviruses, have been proposed as gene delivery vehicles.

Adenoviral vectors (AdV), for example, have a recognized potential for gene
delivery, founded in their broad host range, robust growth in culture, and capacity to
infect mitotically quiescent cells (Graham and Prevec, Manipulation of adenovirus
vectors, p. 109-128, In E. J. Murray (ed.), Methods in Molecular Biology, vol. 7,
Humana, Clifton, NJ, 1991; Trapnell and Gorziglia, Curr. Opin. Biotechnol. 5:617-625,
1994). AdV can be propagated in a helper cell line, 293, a human embryonic kidney cell
line transformed by adenovirus type 5 (Graham et al., J. Gen. Virol. 36:59-72, 1977).
293 cells express the viral E1 gene products (E1a and E1b) that are the master regulatory
proteins for subsequent viral gene expression. E1-deleted viruses can propagate in 293
cells, but not in other cells. Although it would be expected that E1-deleted viruses lack
the machinery to express viral genes, several studies have demonstrated that cellular
E1-like components can stimulate viral gene expression (Imperiale et al., Mol. Cell. Biol.
4:867-74, 1984; Onclercq et al., J. Virol. 62:4533-7, 1988; Spergel et al., J. Virol.
66:1021-30, 1992). The expression of these viral genes results in the relatively rapid
elimination of transduced cells in vivo as a result of cytotoxic T cell responses (Yang et
al., Immunity 1:433-42, 1994; Yang et al., Gene Ther. 3:137-44, 1996; Yang et al., J.
Virol. 69:2004-15, 1995).

Thus attention has been focused on eliminating the remaining vestiges of viral
expression. Viral genes that have been deleted for this purpose include the gene for E4
proteins (Armentano et al., Hum. Gene Ther. 6:1343-53, 1995; Kochanek et al., Proc.
Natl. Acad. Sci. USA 93:5731-6, 1996; and Yeh et al., J. Virol. 70:559-565, 1996),
DNA binding protein (Engelhardt et al., Proc. Natl. Acad. Sci. USA 91:6196-6200,
1994; and Gorziglia et al., J. Virol. 70:4173-8, 1996), DNA polymerase (Amalfitano et
al., J. Virol. 72:926-33, 1998), and the preterminal protein (Schaack et al., Proc. Natl.
Acad. Sci. USA 93:14686-91, 1996). The most aggressive approach has been the
creation of helper virus-dependent vectors that lack all viral genes (Hardy et al., J. Virol.
71:1842-9, 1997; Kochanek et al., Proc. Natl. Acad. Sci. USA 93:5731-6, 1996; Lieber
et al., J. Virol. 70:8944-60, 1996; Mitani et al., Proc. Natl. Acad. Sci. USA 92:3854-8,
1995; and Parks et al., Proc. Natl. Acad. Sci. USA 93:13565-13570, 1996). These
vectors have high capacity, evoke reduced cellular immune responses, and show
prolonged expression in vivo (Morsy et al., Proc. Natl. Acad. Sci. USA 95:7866-71,
1998). However, deploying these viruses on the scale required for human clinical
application presents major challenges because a cesium chloride (CsCl) gradient is
needed to remove the helper virus.


Summary of the Invention 
In one aspect, the invention features a replicatable viral DNA vector encoding a 
site-specific DNA-altering enzyme and a DNA target recognized by said enzyme, said 
enzyme selectively converting, in a cell expressing said enzyme, said DNA vector to a 
rearranged form.

In preferred embodiments, the rearranged form includes an autonomously
replicating episome and a linear DNA product. In other preferred embodiments, the
vector comprises adenoviral DNA.

In yet other preferred embodiments, the vector includes a genetically-engineered
recombination site (such as a target of Cre or FLP). Preferably, such a recombination
site includes a recognition sequence of a site-specific DNA altering enzyme.

In another preferred embodiment, the site-specific DNA altering enzyme is a
recombinase (such as Cre or FLP) or an integrase. Preferably, such an enzyme is
functional in a mammalian cell. Preferred embodiments of the vector also include an
origin of replication that functions in a mammalian cell (such as an Epstein Barr Virus
replicon). Moreover, the vector typically includes a gene of interest (such as a
therapeutic gene that encodes a protein or polypeptide or an RNA product).

In another aspect, the invention features a method for assembling a recombinant
adenoviral DNA. The method, in general, includes the steps of: (a) providing a first
linearized DNA vector comprising a restriction site and a cos site and a second
linearized DNA vector comprising the restriction site, an adenoviral nucleic acid
molecule, and a cos site; and (b) ligating the first and second linearized DNA vectors,
the ligation assembling a recombinant adenoviral DNA.

In preferred embodiments, the first linearized DNA vector comprises a selectable
marker (such as a gene encoding a polypeptide that confers, on a host cell expressing
such a polypeptide, resistance to an antibiotic). In other preferred embodiments, the first
linearized DNA vector includes an adenoviral left-end inverted terminal repeat; a gene
of interest; or both. In still other preferred embodiments, the second linearized DNA
vector includes a selectable marker. Preferably, the second linearized DNA vector
includes an adenoviral right-end inverted terminal repeat.

The method further includes packaging the assembled adenoviral DNA into a
phage and infecting a host cell. Typically the first and second linearized DNAs include
cosmid vector DNA. In addition, such adenoviral DNA is typically flanked by cleavage
sites (such as intron endonuclease cleavage sites).

In another aspect, the invention features an adenovirus producer cell having a
nucleic acid molecule that expresses a dominant negative site-specific DNA-altering
enzyme. In preferred embodiments, the site-specific DNA altering enzyme is a
dominant negative recombinase (for example, a Cre recombinase such as CreY324C or a
Flp recombinase). Exemplary adenovirus producer cells include, without limitation, 293
human embryonic kidney cells, PER.C6 cells, and N52 cells.

In yet another aspect, the invention features a vector comprising, in the 5' to 3'
direction, a first genetically engineered cis-acting target recognized by a site-specific
DNA altering enzyme; a gene of interest; a lineage-specific gene promoter; a second
genetically engineered cis-acting target recognized by a site-specific DNA altering
enzyme; and a nucleic acid molecule encoding a site-specific DNA altering enzyme.

In still another aspect, the invention features a vector including, in the 5' to 3'
direction, a first genetically engineered cis-acting target recognized by a site-specific
DNA altering enzyme; a gene of interest; a bi-directional promoter, comprising a second
genetically engineered cis-acting target recognized by a site-specific DNA altering
enzyme; and a nucleic acid molecule encoding a site-specific DNA altering enzyme.

In related aspects, the invention features a method of gene therapy including the
administration to a patient in need of gene therapy a therapeutically effective amount of
the vector of the invention, which is expressed in the patient. The invention further
relates to a population of cells transfected with the vector of the invention.

Accordingly, the invention further relates to the use of a recombinant viral vector
or use of a recombinant viral particle for gene therapy. Such vectors and viral particles
may be introduced either in vitro into a host cell removed from the patient, or directly in
vivo, into the body to be treated, according to standard methods known in the art.

The invention also relates to a pharmaceutical composition that includes a
therapeutically effective amount of a recombinant viral vector or viral particle prepared
according to the methods disclosed herein, in combination with a vehicle that is
acceptable from a pharmaceutical standpoint. Such a pharmaceutical composition may
be prepared according to the techniques commonly employed and administered by any
known administration route, for example systemically (in particular, by intravenous,
intratracheal, intraperitoneal, intramuscular, subcutaneous, intratumoral, or intracranial
routes) or by aerosolization or intrapulmonary administration.

One skilled in the art will appreciate that suitable methods of administering a
vector (particularly an adenoviral vector) of the present invention to an animal for
purposes of gene therapy, chemotherapy, and vaccination are available, and, although
more than one route can be used for administration, one particular route may provide a
more immediate and more effective reaction than another. Pharmaceutically acceptable
excipients also are well known to those who are skilled in the art, and are readily
available. The choice of excipient will be determined, in part, by the particular method
used to administer the recombinant vector or particle. Accordingly, there are a wide
variety of suitable formulations for use in the context of the present invention.

By "recombinant DNA vector" is meant a DNA sequence containing a desired
sequence (such as a gene of interest) and appropriate regulatory element(s) necessary
for the expression of the operably linked sequence in a particular host organism (such as
a mammal).

By "operably linked" is meant that a gene and regulatory element(s) are
connected to permit gene expression when the appropriate molecules (for example,
transcriptional activator proteins) are bound to the regulatory sequence(s).

By "regulatory element" is meant a genetic element that controls some aspect of
the expression of a nucleic acid sequence. For example, a promoter is a regulatory
element that facilitates the initiation of transcription of an operably linked coding region.
Other genetic regulatory elements include, without limitation, splicing signals,
polyadenylation signals, and termination signals. For example, transcriptional
regulatory elements in eukaryotes include promoter and enhancer elements. Promoters
and enhancers include arrays of DNA sequences that interact directly or indirectly with
cellular proteins involved in transcription. Promoter and enhancer elements have been
isolated from a variety of eukaryotic sources including genes in mammalian cells and
viruses.

By "transfection" is meant the introduction of foreign DNA into eukaryotic cells.
Transfection is typically accomplished by a variety of means known in the art including,
without limitation, calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated
transfection, electroporation, microinjection, liposome fusion, lipofection, protoplast
fusion, and biolistics.

By "stably transfected" is meant the introduction of foreign DNA into the
genome of the transfected cell. In general, transfer and expression of transgenes in
mammalian cells are now routine practices to those skilled in the art, and have become
major tools to carry out gene expression studies and to generate vectors useful in gene
therapy.

By "gene of interest" is meant a gene inserted into a vector whose expression is
desired in a host cell. Genes of interest include, without limitation, genes having
therapeutic value, as well as reporter genes. A variety of such genes are useful in the
invention, including genes of interest encoding a protein that provides a therapeutic
function. In addition, the gene of interest, if a therapeutic gene, can render its effect at
the level of RNA, for instance, by encoding an antisense message or ribozyme, a protein
which affects splicing or 3' processing (e.g., polyadenylation), or it can encode a protein
which acts by affecting the level of expression of another gene within the cell (i.e.,
where gene expression is broadly considered to include all steps from initiation of
transcription through production of a processed protein), for example, by mediating an
altered rate of mRNA accumulation, an alteration of mRNA transport, and/or a change
in post-transcriptional regulation.

By "reporter gene" is meant a gene sequence that encodes a reporter molecule
(including an enzyme). A "reporter molecule" is detectable in any detection system,
including, but not limited to, enzyme (e.g., ELISA, as well as enzyme-based
histochemical assays), fluorescent, radioactive, and luminescent systems. Exemplary
reporter gene systems include the E. coli beta-galactosidase or glucuronidase genes,
green fluorescent protein (GFP), blue fluorescent protein (BFP), the human placental
alkaline phosphatase gene, and the chloramphenicol acetyltransferase (CAT) gene; other
reporter genes are known in the art and may be employed as desired.

By "transgene" is meant any piece of DNA that is inserted by artifice into a
cell and becomes part of the genome of the organism that develops from that cell.
Such a transgene may include a gene that is partly or entirely heterologous (i.e., foreign)
to the transgenic organism, or may represent a gene homologous to an endogenous gene
of the organism.

By "transgenic" is meant any cell that includes a DNA sequence that is
inserted by artifice into a cell and becomes part of the genome of the organism that
develops from that cell.

By "polypeptide" is meant any chain of amino acids, regardless of length or post- 
translational modification (for example, glycosylation or phosphorylation). 

By "derived from" is meant isolated from or having the sequence of a naturally
occurring sequence (e.g., a cDNA, genomic DNA, synthetic, or combination thereof).

By "nucleic acid" is meant a polynucleotide (DNA or RNA).

By "gene" is meant any nucleic acid sequence coding for a protein or an RNA 
molecule. 

By "gene product" is meant either an untranslated RNA molecule transcribed
from a given gene or coding sequence (such as mRNA or antisense RNA) or the
polypeptide chain translated from the mRNA molecule transcribed from the given gene
or coding sequence. Nucleic acids according to the invention can be wholly or partially
synthetically made, can comprise genomic or complementary DNA (cDNA) sequences,
or can be provided in the form of either DNA or RNA.

The presently claimed invention affords a number of advantages. For example,
applicants' gene therapy vehicles, particularly those based on recombinant adenoviruses,
minimize the propensity of the vectors to activate host immune surveillance, and thereby
maximize the persistence of the transduced DNA. The invention therefore facilitates
the development of gene delivery vectors designed to enhance persistence of virally
delivered genes and evade the cellular immune response by severing the connection
between the sole adenoviral enhancer and the sequences encoding potentially antigenic
viral proteins.

As described in more detail below, the mechanism by which this is accomplished
differs significantly from any other previous approaches. For example, to reduce the
immunogenicity of vectors it is widely acknowledged that some intervention, such as the
removal of key genes, or the prevention of their expression in the cells targeted for
therapy, is important; however, many related approaches are directed at the host and
have generally focused on the selective induction of tolerance to adenoviral antigens, or
similar strategies directed at inducing a temporally restricted or antigen-specific
compromise of the immune system.

In addition, the poor persistence of transduced DNA appears to be due in part to
immunological rejection of transduced cells and to the inability of the viral DNA to
replicate, a feature generally inherent in the design of adenoviral vectors, but one which
is not associated with applicants' claimed gene therapy vehicles.

Moreover, some contemporary adenoviral vectors are designed to propagate in
specific host cells which provide essential replication factors in trans. These vectors are
typically based on cell lines which express the master regulatory proteins of the E1
complex, which are required for induction of adenoviral DNA replication. In cells
expressing E1 genes, the best studied of which is a human embryonic kidney cell line
transformed by DNA from human adenovirus 5 (called HEK293, or simply 293), viruses
lacking E1 genes propagate well. Such viruses do not propagate on cell lines which do
not express E1, and do not generally propagate well in the target cells to which the
therapeutic gene is to be delivered. Cells transduced with E1-deleted adenovirus vectors
also do not express high levels of viral genes in the absence of E1. However, the weak
residual expression that remains in such vectors appears to be sufficient to induce
cellular immune responses that contribute to the destruction of the transduced cells.

In addition, the gene therapy vectors claimed herein are hybrid vectors capable of
self-rearrangement to form circular and linear DNA products. The linear DNA has a
compromised ability to express adenoviral genes, and therefore has a lower
immunological profile. The circular DNA behaves like a mammalian plasmid,
encoding the gene of interest and persisting by autonomous replication in the nucleus.

For example, the circularization of an adenoviral vector via the action of Cre
recombinase beneficially places a gene of interest (for example, a therapeutic gene) on a
self-replicating episome. Vector circularization occurs in a tissue-targeted manner, for
example, as a result of the activation of a synthetic liver-specific promoter upstream of
the recombinase Cre. Once circularized, the EBV replicon in the episome confers
improved persistence on the therapeutic gene as detected by reporter gene expression
and direct assay for the presence of vector DNA sequences.

Furthermore, the invention eliminates the requirement for a helper virus, thus
avoiding two potential limitations of that system. First, the continuous expression of Cre
recombinase may lead to toxicity in host cells, either as a direct consequence of the
protein's activity or via its immunogenicity. Second, the Cre helper virus may itself
produce antigenic viral proteins that contribute to the immunologic elimination of
infected host cells. In contrast, the self-resolving adenovirus/EBV vector system
disclosed herein advantageously provides no alternative source of viral proteins, and Cre
expression is terminated upon rearrangement.

In addition, the invention described herein provides tools for analyzing the roles 
of the enhancer in viral gene regulation and virus growth. 

The invention also provides a convenient general system for creating
recombinant adenoviruses, which increases their attractiveness as gene transduction tools
for basic research. The system, for example, employs two conventional plasmid vectors
and a λ phage packaging step. The entire recombinant AdV genome is assembled into a
single cosmid that is easily amplified in E. coli. The use of intron endonuclease
recognition sequences flanking the ITRs enhances virus production while simplifying
insertion of therapeutic gene sequences into the pLEP shuttle plasmid. The convenience
of this vector system has facilitated the construction of over two hundred recombinant
viruses to date.

Other embodiments and advantages of the invention will be apparent from the 
detailed description thereof, and from the claims. 

Brief Description of the Drawings

FIGURE 1A is a schematic diagram of the structure of an adenoviral type A
vector and its fate in a target cell. enh refers to the Ad2 enhancer; GFP refers to the
marker gene green fluorescent protein; EBV refers to the Epstein Barr Virus replicon;






WO 02/20814 



PCT/US01/27682 



TetO7 refers to a heptamer of Tet operator; TetR refers to the Tet repressor; VP16 refers
to the viral protein 16 of Herpes simplex virus; SD refers to the splice donor site; and SA
refers to the splice acceptor site.

FIGURE 1B is a schematic diagram of the structure of an adenoviral type B
vector and its fate in a target cell. enh refers to the Ad2 enhancer; GFP refers to the
marker gene green fluorescent protein; EBV refers to the Epstein Barr Virus replicon;
SD refers to the splice donor site; and SA refers to the splice acceptor site.

FIGURE 2A shows a schematic diagram of the pLEP cosmid polylinker region
and its position relative to the adenoviral left ITR. The adenovirus enhancer/packaging
sequence (ψ) is boxed.

FIGURE 2B is a schematic diagram showing the generation of a single cosmid
encoding the AdV genome by the direct ligation of two smaller plasmids. A gene
expression unit, CMVGFP, was inserted into the pLEP cosmid at the polylinker region.
pLEP and pREP cosmids were digested with an intron endonuclease (PI-PspI), ligated,
and packaged in vitro to generate pAd2CMVGFP. This DNA was then digested with
another intron endonuclease (I-CeuI) to expose the ITRs at both ends of the viral
genome. Finally, cosmid digestion mixtures were transfected into 293 cells. Plaques
generated by recombinant viruses are detected in 7-10 days.

FIGURE 3A shows the restriction analysis of cosmids carrying the full length
AdV DNA, showing uniform generation of the desired vector DNA. 2 μg DNA samples
from four pAd2-7CMVGFP colonies were digested with Bgl II, resolved on a 1%
agarose gel, and stained with ethidium bromide. The predicted sizes of the DNA
fragments are: 13261, 7684, 5228, 5088, 2284, 1757, 1549, 1270, 351, and 275 base
pairs (bp). The 5228 and 5088 bp fragments appear as a doublet, and the 351 and 275 bp
fragments are too small to be seen on the gel.
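
As a rough cross-check of the predicted Bgl II pattern listed for FIGURE 3A, the following sketch sums the fragment sizes and flags which fragments would be expected to co-migrate or to escape detection. The fragment sizes come from the text; the 5% size-difference threshold for a doublet and the 500 bp visibility cutoff are illustrative assumptions, not values from the description.

```python
# Predicted Bgl II fragment sizes (bp) as listed in the description of FIGURE 3A.
FRAGMENTS_BP = [13261, 7684, 5228, 5088, 2284, 1757, 1549, 1270, 351, 275]

# Illustrative assumptions (not from the source): neighbouring fragments whose
# sizes differ by less than 5% are treated as unresolved, and fragments under
# 500 bp are treated as too small to see on the gel.
DOUBLET_FRACTION = 0.05
MIN_VISIBLE_BP = 500


def total_size(fragments):
    """Total length of the digested cosmid, in bp."""
    return sum(fragments)


def doublets(fragments, fraction=DOUBLET_FRACTION):
    """Neighbouring fragments (sorted by size) that differ by less than `fraction`."""
    ordered = sorted(fragments, reverse=True)
    return [
        (big, small)
        for big, small in zip(ordered, ordered[1:])
        if (big - small) / big < fraction
    ]


def too_small(fragments, cutoff=MIN_VISIBLE_BP):
    """Fragments below the assumed visibility cutoff."""
    return [size for size in fragments if size < cutoff]


if __name__ == "__main__":
    print("total bp:", total_size(FRAGMENTS_BP))
    print("doublets:", doublets(FRAGMENTS_BP))
    print("not visible:", too_small(FRAGMENTS_BP))
```

Under these assumptions the sketch reports a total of 38,747 bp, a single 5228/5088 doublet, and the 351 and 275 bp fragments as not visible, in line with the description of the gel.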

FIGURE 3B shows the release of the recombinant Ad DNA from cosmids by
I-CeuI digestion. 2 μg of pAd2-7CMV DNA from two clones was digested with I-CeuI.
Arrows indicate the positions of the released recombinant AdV DNA and the vector
fragments of approximately 35 kb and 5 kb, respectively.

FIGURE 4A shows the appearance of plaques in 293 cells transfected with 10 μg
of pIAdGFPB with no ITR exposed (undigested), one ITR exposed (BsaBI or I-CeuI), or
both ITRs exposed (BsaBI plus I-CeuI). Values represent the mean plaque counts per
dish and the time required for plaque development in 293 cells from three separate
experiments. "I" designates I-CeuI; and "B" designates BsaBI.

FIGURE 4B shows the viral titers obtained from plaques that were allowed to
grow over 10 days after transfection. Viruses were harvested and the titer of each virus
stock was determined by a GFP based semi-quantitative titration procedure described
herein. Values represent the mean ± SE of three independent determinations.
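
The "mean ± SE of three independent determinations" reported for FIGURE 4B can be computed as sketched below. This is a minimal illustration that assumes SE denotes the standard error of the mean, calculated from the sample standard deviation (n - 1 denominator); the titers in the usage example are hypothetical placeholders, not data from the figure.

```python
import math
import statistics


def mean_and_se(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    if len(values) < 2:
        raise ValueError("need at least two replicates to estimate SE")
    mean = statistics.mean(values)
    # Assumption: SE = sample standard deviation (n - 1) divided by sqrt(n).
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se


if __name__ == "__main__":
    # Hypothetical titers from three determinations; placeholders only.
    titers = [1.0e8, 2.0e8, 3.0e8]
    mean, se = mean_and_se(titers)
    print(f"{mean:.2e} ± {se:.2e}")
```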

FIGURE 5 is a schematic diagram showing a linear AdV that resolves into a
circular episome. The elements involved in the self-directed rearrangement of the vector
are shown schematically in pLEPlBHCRGFP/EBV and in the corresponding AdV.
Starting from the left ITR, the elements are shown in the following sequence: left ITR, 147
bpr*^t^4bp-ioxP site; 185^ 

from EF1α gene first intron; 720 bp GFP cDNA; 230 bp SV40 poly(A); 1.7 kb
TK-EBNA-1/OriP; 970 bp HCR12 promoter; 1 kb EF1α gene first intron containing splice
donor (SD) and acceptor (SA) sites with the second loxP site inserted at 64 bp upstream
of the 3' end; 1.2 kb Cre gene tagged with AU1 and a nuclear localization signal; ~120 bp
poly(A) signal and PI-PspI site. After infection of liver cells, the HCR12 promoter
drives the expression of Cre, which results in the cleavage of the two loxP sites. This
results in the circularization of the fragment containing the EBV replicon. The excision
severs the connection between the enhancer/packaging signals and the remainder of the
AdV genome. The Cre gene becomes promoterless and is left on the AdV genome
fragment. After excision, the HCR12 promoter drives the expression of the GFP
reporter gene. The EBV replicon maintains the excised circle as an episome in host
cells.
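
The promoter switch described for FIGURE 5 can be illustrated with a toy model of excision between two directly repeated loxP sites. This is a deliberately simplified sketch: the element names are illustrative labels, element lengths and the enhancer/packaging signal are omitted, and "driven by" is reduced to "first gene encountered downstream of the promoter". It is not a model of the actual vector sequence.

```python
# Simplified, illustrative element order (labels are assumptions for this sketch).
VECTOR = [
    "left_ITR", "loxP", "intron_fragment", "GFP", "SV40_polyA",
    "TK-EBNA-1/OriP", "HCR12_promoter", "SD", "loxP", "SA",
    "Cre", "polyA", "AdV_genome",
]
GENES = {"GFP", "Cre"}


def excise(elements, site="loxP"):
    """Split a linear element list at two directly repeated sites.

    Returns (circle, linear); each product keeps one copy of the site.
    """
    positions = [i for i, name in enumerate(elements) if name == site]
    if len(positions) != 2:
        raise ValueError("this sketch expects exactly two recombination sites")
    first, second = positions
    circle = elements[first:second]                 # one site plus the interior
    linear = elements[:first] + elements[second:]   # flanks joined at the other site
    return circle, linear


def first_gene_downstream(elements, promoter, circular=False):
    """First gene found when reading forward from `promoter`, or None."""
    if promoter not in elements:
        return None
    start = elements.index(promoter)
    n = len(elements)
    # On a circle, reading continues past the end back to the beginning.
    steps = range(1, n) if circular else range(1, n - start)
    for step in steps:
        name = elements[(start + step) % n]
        if name in GENES:
            return name
    return None


if __name__ == "__main__":
    circle, linear = excise(VECTOR)
    print("before:", first_gene_downstream(VECTOR, "HCR12_promoter"))
    print("circle:", first_gene_downstream(circle, "HCR12_promoter", circular=True))
    print("linear:", first_gene_downstream(linear, "HCR12_promoter"))
```

In this toy model the promoter reads into Cre before excision and into GFP on the excised circle, while the linear remainder carries Cre with no HCR12 promoter, mirroring the behavior the figure description attributes to the vector.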

FIGURE 6A is a schematic representation of the loxP sites and EBNA-1
locations in the AdV genome. The relevant Bgl II site is also shown.

FIGURE 6B shows the time course of rearrangement in HepG2 and HeLa cells at
an equal multiplicity of infection (moi) of 1,000 particles per cell. Cells were infected
with Ad2HCRGFP/EBV viruses for 2 hours at 37 °C. Hirt DNA samples were extracted
from the cells. ~5 μg of Hirt DNA samples were digested with Bgl II, fractionated on a
1% agarose gel, and analyzed by Southern blot techniques using a 32P-labeled EBNA-1
fragment as the hybridization probe.

FIGURE 6C shows the DNA blot results obtained from HeLa cells infected at a
moi of 10,000 and HepG2 cells at a moi of 1,000. The upper bands (4915 bp) represent the
circularized DNA fragments whereas the lower bands (3162 bp) represent the
non-circularized AdV.

FIGURE 7A shows green fluorescent protein (GFP) expression in liver and
non-liver cells infected with the Ad2HCRGFP/EBV viruses. Cells were cultured in 35 mm
dishes and infected with the Ad2HCRGFP/EBV virus at the desired moi. HepG2 cells were
infected with 1,000 particles per cell, whereas HeLa cells were infected with a moi of
10,000. GFP expression was examined at the indicated time points after infection.
Fluorescent cells were photographed using an Olympus SC35mm camera mounted on an
Olympus IX70 fluorescent microscope, at 200x magnification, using a filter with peak
excitation and emission wavelengths of 450 nm and 510 nm, respectively.

FIGURE 7B shows the expression of GFP in HepG2, HeLa, A431, and HT29
cells. Cells were seeded in 35 mm dishes and infected with the Ad2HCRGFP/EBV
virus at a moi of 10,000 particles per cell. GFP expression was examined at 72 hours
after infection.

FIGURE 7C shows the expression of GFP in human primary hepatocytes. These
cells were photographed under bright field (left) and fluorescent conditions (right).

FIGURE 8A shows the results of RT-PCR that was performed to detect the
tripartite leader sequence (upper panel) for virus late gene expression; PCR was
performed on the DNA samples for detection of the AdV genomes. The specific target
sequences are described in detail infra. Adenovirus late gene expression was analyzed
by PCR in cells infected with the first generation AdVs or the self-resolving
Ad2HCRGFP/EBV. HepG2 cells were cultured in 35 mm dishes and
infected with increasing moi (0, 10, 100, 1000, 10,000, and 100,000) of adenoviral
vectors. RNA and DNA were isolated in parallel from the cells at 72 hours after
infection.

FIGURE 8B shows a summary of quantitative RT-PCR and PCR results. Each
value is the average of three experiments.

FIGURE 9A is a schematic diagram depicting the deletion analysis of the OriP
and EBNA-1 regions of the EBV replicon. Structures of the deletions in EBNA-1 and
OriP are schematically represented. Elements considered important for episomal
maintenance are indicated. FR refers to the family of repeats; DS designates the region
of dyad symmetry; LR1 refers to the so-called linker region 1; GA refers to gly-ala
repeats; LR2 refers to linker region 2; and Dimerization designates the dimerization
domain.

FIGURE 9B is a graph depicting fractions of GFP positive cells carrying the
EBV replicons represented in Figure 9A.

FIGURE 10A shows the positions and identities of Cre mutants tested for their 
dominant negative Cre activities. 

FIGURE 10B is a schematic diagram of the substrate Cre plasmid (ad2239) used
to test dominant negative functions of Cre mutants.

FIGURE 10C shows GFP expression in cells cotransfected with the substrate Cre
plasmid (ad2239) and the indicated Cre mutants.

FIGURES 10D and 10E show Cre mutants tested for their ability to inhibit 
rearrangement. Only those showing the strongest inhibitory activities were retested in 
Fig. 10E. GFP intensity was normalized to that of cells in the absence of inhibition. 

FIGURE 11 shows GFP expression in 293 TetON cells and #17 cells transfected
with ad2239. The ability of #17 cells to inhibit Cre activity is demonstrated by the weak
GFP signal in cells treated with 2 μM doxycycline.

FIGURE 12 is a schematic diagram depicting the tetracycline mediated
auto-regulatory circuit.

FIGURES 13A and 13B show the effects of different basal elements on synthetic
TetO promoter activity. FIGURE 13A shows a schematic diagram of the components of
various auto-regulatory synthetic TetO promoters. FIGURE 13B shows a comparison of
the strength of auto-regulatory synthetic TetO promoters bearing different basal
elements, in the presence and absence of tetracycline, using GFP as a marker in HepG2
cells.

FIGURE 14 shows the structure of a Cre substrate plasmid (ad2265). The
promoter, EF1α, and the gene, BFP, are interrupted by two loxP sites, which can be
joined by Cre-mediated recombination. PA stands for poly A; BFP for blue fluorescent
protein.

FIGURES 15A and 15B show the estrogen regulation of Cre recombinase
activity. 293 cells infected with type B virus, AD121.5, in which the Cre enzyme is
fused with the estrogen ligand binding domain at both the N- and C-termini, were cultured
in the presence or absence of 1 nM estrogen. Cre-mediated rearrangement in the presence
of estrogen is shown in Figure 15A, whereas blot analysis of extrachromosomal DNA
from the same cells is shown in Figure 15B. L represents the position corresponding to
the unrearranged adenoviral DNA; and C represents the position corresponding to the
circular form of DNA.

FIGURE 16 shows the rearrangement of adenoviral sequences in vivo.
Extrachromosomal DNA from the livers of Rag-2 mice sacrificed 2.5 hrs post injection
of type A adenovirus, AD102.7, was analyzed by DNA blot. L represents the size
corresponding to linear adenoviral DNA; and C represents the size corresponding to
rearranged circular DNA.

FIGURE 17 is a photomicrograph depicting high level GFP expression in Rag-2
mouse hepatic tissues 48 hrs post type A adenovirus (AD102.7) injection.

FIGURES 18A and 18B show schematic diagrams of the structures of adenoviral
vectors and their fates in target cells. enh refers to Ad2 enhancer; GFP refers to green
fluorescent protein; EBV refers to Epstein Barr Virus replicon; TetO7 refers to heptamer
of Tet operator; TetR refers to Tet repressor; VP16 refers to transcriptional activator
domain from HSV protein 16; SD refers to splice donor site; and SA refers to splice
acceptor site.

FIGURE 19A shows the structure of a FLP substrate plasmid, ad2879. The
promoter, EF1α, and the gene, GFP, are interrupted by 2 FRT sites, which can be joined
by FLP-mediated recombination. PA stands for poly A; BFP for blue fluorescent
protein.

FIGURE 19B shows the structure of a Cre substrate plasmid, ad2204.

FIGURE 20 shows the structures of several FLPe anti-sense plasmids. 

FIGURE 21 is a panel of photomicrographs showing inhibition of FLP enzyme
activity by anti-sense FLP. 293 cells were transfected with FLP substrate (Figure 12)
and the plasmids indicated in each photo. High GFP intensity indicates higher expression
of FLP and less inhibition by the expressed anti-sense.

FIGURE 22 shows a schematic diagram of FRT/Cre and loxP/FLP adenovirus. 



Detailed Description
Described herein are systems for the regulated self-rearrangement of DNA 
vectors, for example, gene therapy vectors. Such regulated self-rearrangement has the 
potential to prevent unwanted expression of vector genes not required for a therapeutic 
effect, and to allow the stable association of the therapeutic gene with the target cell.

The essential elements of the regulated DNA rearrangement system are a gene 
which encodes one or more proteins which induce DNA rearrangement, a method for 
regulating the activity of those proteins or their abundance, and a target DNA sequence 
on which those proteins act. Particularly desirable are methods for regulating the
activity of the proteins or their abundance which can be easily carried out on an intact
organism, such as administration or withdrawal of a drug, hormone, or environmental
stimulus such as heat or irradiation, which induces the activity or abundance of the
proteins which cause DNA rearrangement.

Especially desirable are regulated DNA rearrangement systems in which all of
the components can be delivered in a single vector. An example of this is a virus which
bears both the cis-acting sequences for DNA rearrangement as well as the protein or 
proteins which act on those sequences, and the regulatory apparatus which controls the 
activity or abundance of those proteins. However, it is not necessary that the different 
elements be encoded in a single nucleic acid. 

The important elements of this strategy are: the compromise of vector gene
function by regulated rearrangement of DNA topology, the generation of plasmid circles
from vector DNA in a regulated manner, and the removal of enhancer or promoter
elements from the vector DNA by regulated excision. It is also important that the
circular DNA generated by site-specific recombination possesses a mechanism for stable
association with the host genome in some form, here conferred by the EBV replicon. In
other embodiments, the circular DNA might possess the ability to direct its integration 
into the host chromosomes by a site-specific integration. Site-specific integration into 
the host chromosomes may also be generated by the action of a regulated site-specific 
recombinase on a linear template without passing through a circular intermediate. 

Also described herein is one particular self-rearranging vector that begins as a
hybrid adenovirus vector which is capable of converting itself into two unlinked
molecules, a circular and a linear DNA. After this event the linear DNA product is
deleted for two important cis-acting sequences: the packaging signals, which are
required for insertion of the viral DNA into the viral capsid, and the enhancer, which
increases the expression of other promoters encoded in the viral DNA. The remaining
linear DNA is thereby compromised in its ability to express adenoviral genes, endowing
the vector with a lower immunological profile. The circular DNA generated by the
excision event is a mammalian cell plasmid which has the capacity to persist by
autonomous replication in the nucleus. This capacity is encoded in genetic elements
derived from the Epstein Barr virus (EBV). A schematic diagram of such a vector is
illustrated in Figure 5.

Epstein Barr virus is a human herpes virus which is the etiologic agent of
infectious mononucleosis and which has been implicated in the genesis of Burkitt's
lymphoma, a B cell neoplasm, and is thought to be a predisposing factor for some forms
of nasopharyngeal carcinoma. Approximately 85% of the adult Western population has
a persistent population of B cells which contain a circular latent form of the viral
genome, maintained in cells by the action of Epstein Barr Nuclear Antigen 1 (EBNA1),
a DNA replication protein that acts on the viral latent phase origin of replication, OriP.
EBNA1 in and of itself is not thought to promote neoplasia; current thinking places
greater weight on the actions of the EBNA2 proteins and LMP, latent membrane protein,
in the inception of EBV-associated neoplasm.

Mammalian cell plasmids have been created which bear the EBNA1 gene and
OriP. In nonrodent cells, these plasmids persist by replication with each transit of the
cell cycle. Multiple transcription units can be borne by these plasmids, allowing
regulated expression of diverse gene products.

Preferred adenoviral vectors, shown in Figures 1A and 1B, are linear forms of an
EBV plasmid flanked by loxP sites, cis-acting sequences required for site-specific
recombination directed by the bacteriophage P1 Cre protein. To prepare an adenovirus
bearing both the Cre protein and loxP sites, it is necessary to ensure that the Cre protein is
not expressed while the vector is being propagated in 293 cells. To lower the
immunological profile of the vector, it is also desirable that the Cre protein not be
expressed after the vector has delivered its payload to the target cell and the Cre protein
has performed its function.

To accomplish these objectives, two general approaches have been developed for 
the production of adenoviral chromosomes that circularize following the regulated 
expression of site-specific recombinases. In each case, the vector is engineered to allow
for the production of viruses in 293 cells, and to provide transitory expression of
recombinase that induces rearrangement in target tissues. The major difference between 
the two strategies lies in the way the deinduction of recombinase is achieved. 

In the first approach, adenoviral vectors are engineered to turn an activating
transcription factor into a repressor upon chromosomal rearrangement. Vectors
employing this approach are referred to herein as type A vectors (Figure 1A). In the
second approach, the recombinase promoter is redirected following chromosomal
rearrangement. Vectors utilizing the second approach are referred to as type B vectors
(Figure 1B). In both cases a linear chromosome is converted to its circular episomal
form and a resulting deleted linear form. The circular DNA contains an Epstein Barr
virus (EBV) replicon, which allows synchronous replication of the episome with the host
mitotic cycle (Reisman et al., Mol. Cell Biol. 8:1822-32, 1985; Yates et al., Nature
313:812-15, 1985). The linear DNA is deleted for the enhancer and E1 genes.

One self-regulated gene switch, employing the type A vector strategy, was
designed based on the bacterial transposon Tn10 tetracycline repressor (tetR) gene. In
its natural context, the tetR protein binds to specific sequences (tet operator sequences)
upstream of a tetracycline resistance gene, preventing transcription of the gene unless
tetracycline is present. To adapt this protein for eukaryotic gene regulation, a gene
fusion is created between tetR and an active portion of a strong eukaryotic
transcriptional activator, the herpes simplex virus VP16 protein. The fusion protein
exerts its action on a synthetic promoter created by the insertion of multiple tet operator
sequences upstream of a basal promoter element. This configuration allows high-level
gene expression whenever the tetR-VP16 fusion protein binds to its cognate operator
sequences. Because the tetR protein normally does not bind to its operator in the
presence of tetracycline, the activity of this synthetic promoter is high in the absence of
tetracycline and low in its presence.
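
The regulatory logic of this synthetic promoter can be summarized as a small truth table. The following is only an illustrative sketch: the function name and the reduction of promoter activity to a yes/no value are assumptions made for clarity, not part of the described constructs.

```python
def synthetic_promoter_active(tetracycline_present: bool,
                              fusion_protein_present: bool = True) -> bool:
    """Toy model of a tet-operator promoter driven by a tetR-VP16 fusion.

    The fusion protein binds the tet operator sequences only when
    tetracycline is absent; bound fusion protein activates transcription.
    """
    fusion_bound_to_operator = fusion_protein_present and not tetracycline_present
    return fusion_bound_to_operator


if __name__ == "__main__":
    for tet in (False, True):
        level = "high" if synthetic_promoter_active(tet) else "low"
        print(f"tetracycline present: {tet!s:5} -> promoter activity {level}")
```

Real promoter output is graded rather than binary; the boolean form is used here only to make the direction of regulation explicit.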

One example of a type A vector is shown in Figure 1A. This self-regulated gene
expression cassette, present in a hybrid adenovirus, consists of a bi-directional promoter
element in which central tetR binding sites are flanked by divergently oriented basal
promoter elements. In one direction the promoter directs the formation of a transcript
encoding the Cre protein; in the other direction, the promoter directs the formation of a
tetR-VP16 fusion protein. The latter differs from the conventional version in bearing a
loxP site between the tetR component and the VP16 component. When tetracycline is
present this gene switch is silent. As shown in Figure 1A, upon introduction into a
target cell in the absence of tetracycline, the tetR-loxP-VP16 fusion protein is produced,
stimulating further production of the fusion protein, and the Cre protein. The Cre protein
then acts to promote site specific recombination between the loxP site in the
tetR-loxP-VP16 coding sequences, and a distant loxP site. As a result of this recombination, the
fusion protein coding sequence is disrupted so that the promoter no longer directs the
formation of a tetR-loxP-VP16 fusion protein, but gives rise to a transcriptionally inert
tetR-loxP protein that lacks the VP16 activation domain and competes with the
tetR-loxP-VP16 fusion protein for binding to the promoter upstream elements, thereby
extinguishing promoter activity.
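
The self-extinguishing behavior of the type A switch can be sketched as a two-variable state machine. This is a deliberately simplified model under stated assumptions: the class name, the discrete time steps, and the treatment of rearrangement as an all-or-nothing event are illustrative choices and do not come from the described vectors.

```python
from dataclasses import dataclass


@dataclass
class TypeASwitch:
    """Toy model: the promoter drives both Cre and the tetR-loxP-VP16 fusion."""
    tetracycline: bool
    rearranged: bool = False  # True once Cre has recombined the loxP sites

    def promoter_on(self) -> bool:
        # The promoter needs an intact fusion gene and no tetracycline.
        return (not self.tetracycline) and (not self.rearranged)

    def step(self) -> None:
        # If the promoter is on, Cre is made and recombination disrupts the fusion gene.
        if self.promoter_on():
            self.rearranged = True


def run(tetracycline: bool, steps: int = 3):
    """Return the promoter state observed at each step, plus the final switch."""
    switch = TypeASwitch(tetracycline=tetracycline)
    history = []
    for _ in range(steps):
        history.append(switch.promoter_on())
        switch.step()
    return history, switch


if __name__ == "__main__":
    for tet in (True, False):
        history, switch = run(tet)
        print(f"tetracycline={tet}: promoter history={history}, "
              f"rearranged={switch.rearranged}")
```

In this sketch the promoter is on for a single step in the absence of tetracycline and then stays off, which mirrors the transient recombinase expression the design aims for.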

As shown in Figure 1A, the excised circular DNA element contains at least two
transcription units. In addition, other transcription units or internal ribosome entry site
elements may be used to allow the coexpression of gene products which are useful for
extending the persistence of the delivered DNA, regulating expression of the gene of
interest, or providing for ablation of the transduced cells once their presence is no longer
desirable. In addition, the linear DNA remaining after excision of the circular gene
expression plasmid lacks both viral packaging sequences and the cis-acting enhancer.
Within this linear DNA, additional loxP sites may be placed to provide for the
rearrangement of the remaining vector DNA in the target cell, disrupting the normal
topology of the genes, and further thwarting expression.

Using the type B vector design strategy, described in greater detail below, a
recombinant adenoviral gene delivery system that is capable of undergoing growth
phase-dependent site-specific recombination has also been constructed. 

The following examples are presented for the purpose of illustrating, not
limiting, the invention. 


TYPE B VECTORS - EXPERIMENTAL RESULTS 
Several experimental examples for constructing type B vectors and for carrying 
out the general approaches of the invention are now described below. 

Two-Cosmid System for Efficient Construction of Recombinant AdV

To simplify and facilitate the generation of recombinant AdV, a system was
established to assemble the desired AdV genome in a single plasmid by ligation (shown
in Figures 2A, 2B). The system consists of two component vectors, a left end plasmid,
pLEP, and a right end plasmid, pREP. The left end Ad sequences (nt 1-376) in pLEP
include the viral inverted terminal repeat, the cis-acting packaging sequences, and the
viral enhancer. Nucleotide (nt) positions described herein refer to the wild type Ad2
sequence in GenBank (J019017). The Ad sequences are followed by the gene
expression unit intended for delivery and an intron endonuclease (PI-PspI) cleavage site.
The right end plasmid contains a PI-PspI site followed by the Ad2 genome from the end
of the E1 locus rightward (nt 3527-35937).

pLEP is a small tractable vector for cloning, whereas pREP is much larger and
contains less frequently manipulated genes. Both pLEP and pREP contain a
bacteriophage λ cos site, oriented to generate a single cosmid of appropriate length for in
vitro packaging following ligation of the two plasmids at the PI-PspI cleavage site.
pLEP is tetracycline resistant (Tet^r) and pREP is ampicillin resistant (Amp^r), allowing
the recombinants to be selectively isolated by co-selection for both markers. In the
resulting assembled cosmid, the adenoviral sequences are closely flanked by cleavage
sites for the intron endonuclease I-CeuI. Digestion with I-CeuI liberates the entire
recombinant AdV genome from the parent cosmid (see Figure 2B).
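
The two filters that enrich for the desired hybrid cosmid, a length requirement for λ in vitro packaging and co-selection for both resistance markers, can be sketched as a simple predicate. The product sizes below are hypothetical placeholders chosen only to illustrate the logic, and the ~40 kbp packaging minimum is the approximate figure quoted elsewhere in this description; none of these numbers are measured values for pLEP or pREP.

```python
from dataclasses import dataclass

PACKAGING_MIN_KBP = 40.0  # approximate lambda packaging minimum cited in the text


@dataclass(frozen=True)
class LigationProduct:
    name: str
    markers: frozenset   # resistance markers carried, e.g. {"Tet", "Amp"}
    length_kbp: float    # hypothetical size, for illustration only


def survives_selection(product: LigationProduct) -> bool:
    """A product is recovered only if it is packaged and grows on Amp/Tet plates."""
    packaged = product.length_kbp >= PACKAGING_MIN_KBP
    doubly_resistant = {"Tet", "Amp"} <= product.markers
    return packaged and doubly_resistant


# Hypothetical ligation products (sizes are assumptions, not source data).
products = [
    LigationProduct("pLEP alone", frozenset({"Tet"}), 5.0),
    LigationProduct("pREP alone", frozenset({"Amp"}), 38.0),
    LigationProduct("pLEP + pREP hybrid", frozenset({"Tet", "Amp"}), 43.0),
]

survivors = [p.name for p in products if survives_selection(p)]

if __name__ == "__main__":
    print(survivors)
```

Either filter alone would admit some unwanted products; applying both is what makes the desired hybrid the predominant colony type in this sketch.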

Three classes of pREP have been constructed to allow the preparation of
AdVs bearing E1 (pREP7; SEQ ID NO.: 2), E1 and E3 (pREP8; SEQ ID NO.: 3), or E1,
E3, and E4 (pREP12; SEQ ID NO.: 4) deletions. pREP7 (SEQ ID NO.: 2) contains nt
3527-35937 of the Ad2 genome, and pREP8 (SEQ ID NO.: 3) carries an additional
deletion in the E3 region (Δ nt 27901-30841). pREP12 (SEQ ID NO.: 4) has deleted
open reading frames (ORF) 1-4 of the E4 region (Δ nt 34121-35469, 1348 bp). AdV
generated with these cosmids should be able to accommodate 5, 8, and 10 kb inserts,
respectively.

These aforementioned vectors were constructed as follows. The EcoRI to BsaI
fragment that spans the ampicillin resistance gene in pBR322 was deleted and replaced
by a synthetic adapter, and the bacteriophage λ cos site was inserted between the unique
StyI and BsmI sites. A PCR amplified Ad2 fragment containing the left end ITR
(L.ITR), enhancer elements, and the encapsidation signal (nt 1-376) was created and
inserted into the adapter (Figures 2A, 2B) to yield the tetracycline-resistant left-end
plasmid pLEP. The right end of Ad2 from the AflII site to the right end (nt 3527-35937)
was assembled into an ampicillin resistant cosmid vector, pACKrr3 (SEQ ID NO.: 1), by
multiple steps of PCR amplification and fragment interchange. The resultant cosmid
was termed pREP7 (SEQ ID NO.: 2). To expand vector capacity, two deletions were
incorporated into the pREP7 (SEQ ID NO.: 2) cosmid: an E3 gene deletion (nt 27901-30841,
2840 bp), yielding cosmid pREP8 (SEQ ID NO.: 3), and a 1.3 kb deletion (nt 34121-35469)
in the E4 region of the Ad2 genome, yielding pREP12 (SEQ ID NO.: 4).

An example of the construction of an AdV carrying a CMV-GFP expression unit
is outlined in Figure 2. pLEPCMVGFP (Tet^r) was digested with PI-PspI and ligated to
the pREP7 (SEQ ID NO.: 2; ΔE1, Amp^r) digested with the same enzyme. The ligation
mixture was packaged with λ phage extracts (MaxPlax lambda packaging extracts,
Epicentre Technologies) and a fraction of the packaged phage was used to infect a
recombination-deficient E. coli host, with selection for the assembled plasmid on
Amp/Tet plates. Transductants containing pLEP fused to pREP were selected on agar
containing 25 µg/ml ampicillin and 12.5 µg/ml tetracycline (Amp/Tet). Colonies were
selected and DNA isolated (Qiagen). DNA was used either for restriction analysis or for
transfection of 293 cells as described herein.

Figure 3A shows typical results for the Bgl II digestion pattern of a
pLEP3CMVGFP/pREP7 hybrid cosmid, pAd2-7CMVGFP DNA. Because of the size
minimum (~40 kbp) for λ phage in vitro packaging and the double antibiotic selection,
most of the colonies growing on Amp/Tet plates were the desired hybrid cosmids, and
undesired rearrangements were rarely seen. In the present example, all four
pAd2-7CMVGFP clones exhibited the digestion pattern predicted from the inferred sequence.
The entire recombinant AdV genome was then released from the cosmid by I-CeuI
digestion (Figure 3B). I-CeuI digestion leaves ten nucleotides to the left of the left ITR
and eight nucleotides to the right of the right ITR. Short flanking sequences have been
reported to be eliminated during replication of recombinant viruses after transfecting the
DNA into 293 (human embryonic kidney) cells (Hanahan et al., Mol. Cell. Biol.
4:302-309, 1984).

The digestion reaction can be transfected into 293 cells without purification as
follows. 293 cells, obtained from Microbix Biosystems (Ontario, Canada), were
cultured in 10 cm dishes in complete Dulbecco's Modified Eagle's Medium (DMEM)
supplemented with 10% FBS, 2 mM glutamine and penicillin/streptomycin (Gibco BRL),
and maintained at 37 °C in a 5% CO2 atmosphere incubator. Cells were grown to ~50%
confluence on the day of transfection. Ten µg of cosmid DNA were digested with
I-CeuI in a volume of 50 µl. The reaction mixture was transfected into 293 cells by
calcium phosphate precipitation (Graham and Prevec, Manipulation of adenovirus
vectors, p. 109-128, In E. J. Murray (ed.), Methods in Molecular Biology, vol. 7,
Humana, Clifton, NJ, 1991) without purification. After transfection, cells were cultured
and examined daily for the appearance of cytopathic effects (CPE). Virus propagation,
purification, plaque assay, and viral DNA isolation were performed using established
protocols (Graham and Prevec, supra). At day six post-transfection, 5-30 viral
plaques/10 cm dish/10 µg DNA were usually apparent, which compared favorably with
the 30-50 plaques/10 cm dish/10 µg DNA found for 293 cells transfected with purified
wild type Ad2 DNA.

To compare the efficiency of recombinant virus production, similar viruses were
also generated by homologous recombination. 20 µg of pREP7 (SEQ ID NO.: 2) was
co-transfected into 293 cells with 10 µg of a plasmid encoding the left end of the
adenoviral genome and a green fluorescent reporter gene (pLITREF1αGFP).
pLITREF1αGFP contained the Ad2 left end nt 1-376, an EF1α promoter/GFP
expression unit and Ad2 sequence (from 3525-8120) that overlaps with the same
sequence in pREP7 (SEQ ID NO.: 2). This overlap fragment served as the region for
homologous recombination. Each co-transfection was performed in duplicate. Initial
plaques took longer to appear (14 days post transfection) and were less abundant (0-3
plaques per plate).

Data in the literature suggest that exposed ITR ends favor efficient virus
production (Hanahan et al., supra). To assess the importance of this effect, an AdV
cosmid, pIAdEF1αGFPB, in which the AdV ITRs were flanked with a different
restriction site at each end was constructed. pIAdEF1αGFPB DNA was digested with
BsaBI to expose the right ITR, I-CeuI to expose the left ITR, or the two enzymes were
used together to expose both ends. Digested cosmid DNA samples were transfected into
293 cells and plaques were allowed to develop. Virus propagation, purification, plaque
assay, and viral DNA isolation were performed using established protocols described in
Graham and Prevec (Manipulation of adenovirus vectors, In E. J. Murray (ed.),
Methods in Molecular Biology, vol. 7, Humana, Clifton, NJ, pp. 109-128, 1991).

Ten days after transfection the viruses were harvested and viral titers were
determined. The average titer for the viral stocks (Figures 4A, 4B) was 1.3 x 10^4 pfu/ml
from transfection with undigested DNA; 2.4 x 10^5 pfu/ml from BsaBI linearized DNA
(free right ITR); 1.1 x 10^5 pfu/ml from I-CeuI linearized DNA (free left ITR); and 2.7 x
10^6 pfu/ml for the BsaBI/I-CeuI double digested DNA (both ITRs free). Thus liberation
of each end resulted in an approximate increase in the efficiency of generating virus by a
factor of ten (Figures 4A, 4B).
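
The "factor of ten per end" statement can be checked with a short calculation on the titers quoted above. The sketch below assumes the four titers are 1.3 x 10^4, 2.4 x 10^5, 1.1 x 10^5, and 2.7 x 10^6 pfu/ml; the dictionary keys are descriptive labels introduced here for readability.

```python
# Average titers in pfu/ml, as quoted in the text above.
titers = {
    "undigested": 1.3e4,
    "BsaBI (right ITR free)": 2.4e5,
    "I-CeuI (left ITR free)": 1.1e5,
    "BsaBI + I-CeuI (both ITRs free)": 2.7e6,
}

baseline = titers["undigested"]

# Fold increase of each condition relative to undigested cosmid DNA.
fold_increase = {condition: titer / baseline for condition, titer in titers.items()}

if __name__ == "__main__":
    for condition, fold in fold_increase.items():
        print(f"{condition:34s} {fold:8.1f}-fold")
```

Freeing a single end gives roughly a 10-fold gain (about 8- to 18-fold), and freeing both ends gives roughly a 200-fold gain, which is in line with two approximately independent 10-fold effects.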

Construction of an AdV Capable of Self-Rearrangement

One approach to attenuating adenoviral gene expression and improving transgene
persistence is the creation of viruses capable of undergoing internal, self-directed
rearrangement upon delivery to the target tissue. In principle, this objective can be
achieved through the regulated expression of site-specific recombinases in vectors that
contain the cis-acting target of recombinase action. To allow such vectors to be created,
the recombinase activity must be suppressed during propagation in the packaging cell
line. As described in more detail below, the use of a lineage-specific promoter to control
recombinase expression has been successfully employed to achieve this end.

An example of this is shown in Figure 5. The expression of Cre recombinase
was controlled by a liver-specific promoter constructed as follows. The human hepatic
control regions 1 and 2 (HCR1 and 2) of the ApoE/C gene locus (Allan et al., J. Biol.
Chem. 270:26278-81, 1995; and Dang et al., J. Biol. Chem. 270:22577-85, 1995) were
amplified by PCR using 293 cell genomic DNA as the template. The following primers
were used to amplify both the HCR1 and HCR2 fragments: HCRtop -
5' gcggaattcggcttggtgacttagagaacagag 3' (SEQ ID NO.: 5); HCRbot -
5' gcgggatccttgaacccggaccctctcacacta 3' (SEQ ID NO.: 6). The amplified PCR fragments
(~0.39 kb) were cloned into pUC19. The HCR1 and HCR2 sequences were confirmed
by dideoxy DNA sequencing. The two fragments were assembled in a head to tail
orientation, fused with a synthetic basal TATA element and cloned in a parental pLEP
vector containing a GFP reporter gene. The resultant plasmid was named
pLEPHCR12GFP. The synthetic liver-specific promoter, as demonstrated below, provided a
means to control Cre recombinase expression during propagation of the vector in 293
cells, and allowed for testing the consequences of abstracting the enhancer from the
linear vector DNA upon delivery of the DNA to the target cells.
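
Inspection of the two HCR primers shows that each carries a common restriction enzyme recognition sequence near its 5' end: GAATTC (EcoRI) in HCRtop and GGATCC (BamHI) in HCRbot. The short sketch below locates these sequences. Whether these particular sites were the ones used for cloning into pUC19 is not stated in the text and is only an assumption here.

```python
# Recognition sequences of two common restriction enzymes (lower case to match the primers).
RECOGNITION_SITES = {"EcoRI": "gaattc", "BamHI": "ggatcc"}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "HCRtop": "gcggaattcggcttggtgacttagagaacagag",
    "HCRbot": "gcgggatccttgaacccggaccctctcacacta",
}


def find_sites(sequence: str) -> dict:
    """Return {enzyme: 0-based position} for each recognition site present."""
    seq = sequence.lower()
    return {
        enzyme: seq.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in seq
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt", find_sites(seq))
```

In both primers the site follows a three-nucleotide 5' extension (gcg), a common arrangement that leaves a few bases flanking the site at the end of the PCR product.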

In 293 cells, this promoter is silent, allowing the viral chromosome to be
propagated with minimal rearrangement. Any rearranged viruses that are formed lack
packaging signals and so disappear from the pool of propagating vectors. In liver cells
the Cre recombinase is induced by the action of the tissue-specific promoter. The
resulting Cre-induced recombination excises a circular episome and redirects the
transcriptional output of the liver-specific promoter so that it directs the synthesis of the
transgene of interest. The remaining linear fragment consists of an adenoviral genome
lacking the enhancer and packaging signals and a Cre expression unit devoid of
promoter sequences.
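
The outcome of the excision can be illustrated with a toy model in which the vector is an ordered list of named elements and Cre resolves the segment between two directly repeated loxP sites into a circle. The element names and their order below are a simplified assumption based on the description in this section, not an exact map of the vector.

```python
def cre_excise(linear):
    """Excise the segment between the first two loxP sites.

    Returns (circle, remaining_linear). Each product retains one loxP site,
    as expected for recombination between directly repeated sites.
    """
    first = linear.index("loxP")
    second = linear.index("loxP", first + 1)
    circle = linear[first + 1:second] + ["loxP"]          # read as a closed ring
    remaining = linear[:first] + ["loxP"] + linear[second + 1:]
    return circle, remaining


# Simplified, assumed element order of the unrearranged vector (left to right).
VECTOR = [
    "left ITR",
    "loxP",
    "enhancer/packaging signals",
    "GFP",
    "polyA",
    "EBV replicon",
    "HCR12 promoter",
    "loxP",
    "Cre",
    "remaining Ad genome",
    "right ITR",
]

circle, remaining = cre_excise(VECTOR)

if __name__ == "__main__":
    print("circular episome:", circle)
    print("deleted linear  :", remaining)
```

Because the circle is closed, the HCR12 promoter at the end of the list is joined back to the start of the list, so that it reads through the loxP site and the enhancer/packaging segment into GFP, while Cre is left on the linear fragment without a promoter.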

In the form discussed here, one loxP site is located at nucleotide 147 of the Ad2
genome, between the left ITR and the enhancer/packaging sequences, and the second
loxP site is placed inside an intron a few bases upstream of the splice acceptor sequence.
Hence the loxP site does not appear in the resulting mature transcript. The Cre coding
sequence that remains on the right end linear fragment after rearrangement lies
downstream from a splice acceptor that lacks a splice donor or upstream promoter
sequences. This effectively terminates the expression of Cre following excision.

Prior to recombination, the Cre recombinase gene is under the control of a
synthetic promoter (referred to as HCR12), consisting of hepatic locus control elements
from the human ApoE/C locus fused to the first intron of the human EF1α gene. After
cyclization the HCR12 promoter lies upstream of the transgene (in this case GFP) and
the distal segment of the intron (beyond the loxP site) contains the adenoviral enhancer.
To facilitate manipulation of the plasmids in E. coli, the human IgG1 hinge-CH2 intron
(118 bp) was inserted in the Cre coding sequence at nucleotide 237, suppressing Cre
expression in bacteria. The circularized episome contains the latent origin of replication
(OriP) and trans-acting DNA replication protein (EBNA-1) of Epstein Barr virus, and
hence is capable of autonomous replication in synchrony with the host mitotic cycle
(Yates et al., Nature 313:812-815, 1985).

Using the two cosmid system described above, the pLEP plasmid containing the
self-resolving components, pLEP1BHCR12, was ligated with pREP8 (SEQ ID NO.: 3;
ΔE1ΔE3) to create pAdVHCRGFP/EBV. The latter was digested with I-CeuI and
transfected into 293 cells. Appearance of plaques from AdVHCRGFP/EBV was
retarded (by 8 days) compared to non-rearranging viruses, perhaps as a result of basal
expression of the liver-specific promoter in 293 cells. However, high titer viral stocks of
10^12 nominal (absorbance-determined) particles/ml were achieved.

Rearrangement in Target and Nontarget Cells

To test excision efficiency, HepG2 (hepatocellular carcinoma) and HeLa (cervical
carcinoma) cells, obtained from ATCC, were infected with virus at a multiplicity of
infection (moi) of 1,000 nominal particles/cell. This titer corresponds to approximately
10 plaque forming units per cell. For these experiments, HepG2 and HeLa cells were
seeded in 35 mm dishes and cultured to approximately 80% confluence in DMEM/FBS
as described herein. Cells were infected with the desired multiplicity of virus in a
volume of 1 ml at 37 °C for 2 hours. At the end of the incubation, cells were washed
with PBS twice and cultured in 2 ml of medium. Cells were collected in parallel at
desired time points for low molecular weight DNA and RNA extraction. Cells were
examined for GFP expression by fluorescence microscopy (Olympus, IX70) or
microtiter plate reader (PerSeptive Biosystem, CytoFluor II) before extraction of DNA
for analysis of chromosomal rearrangement.
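
The relationship between nominal particles and plaque forming units stated above (1,000 particles/cell corresponding to approximately 10 pfu/cell) implies a particle-to-pfu ratio of about 100 for this stock. The helper below applies that ratio; the function and constant names are illustrative, and the ratio is specific to the stock described here.

```python
# From the text: a moi of 1,000 nominal particles/cell is ~10 pfu/cell.
PARTICLES_PER_CELL = 1_000
PFU_PER_CELL = 10

PARTICLES_PER_PFU = PARTICLES_PER_CELL / PFU_PER_CELL  # ~100 for this stock


def particles_to_pfu(moi_particles: float) -> float:
    """Convert a particle-based moi into an approximate pfu-based moi."""
    return moi_particles / PARTICLES_PER_PFU


if __name__ == "__main__":
    for moi in (1_000, 10_000):
        print(f"{moi:>6} particles/cell ~ {particles_to_pfu(moi):.0f} pfu/cell")
```

Particle-to-pfu ratios vary between preparations, so the conversion should be treated as an approximation for this experiment only.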

DNA analysis of chromosomal rearrangement was performed as follows. 5 µg
of Hirt DNA was digested with Bgl II and analyzed by DNA blot techniques using a
labeled EBNA-1 gene fragment as probe (Figure 6). The Bgl II fragment from the
non-circularized AdV is 3162 bp, generated from the 5' end of the AdV to the first Bgl II site
in the AdV. The circularized fragment created from the two loxP sites has a size of
4915 bp (Figure 6A). Densitometry revealed that at 72 hours post infection, 95% or
more of the input genomes had undergone circularization in HepG2 cells. In contrast,
low but detectable levels of circularized fragment were visualized in HeLa cells infected at
the same time and at the same multiplicity of infection used for the HepG2 cells (Figure
6B).
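
The densitometric estimate of circularization can be expressed as the signal in the 4915 bp (circular) band divided by the combined signal of the 4915 bp and 3162 bp (linear) bands. The sketch below assumes that both bands hybridize to the same probe with equal efficiency; the band intensities in the example are hypothetical and are not measured values.

```python
LINEAR_BAND_BP = 3162     # Bgl II fragment from non-circularized AdV
CIRCULAR_BAND_BP = 4915   # Bgl II fragment spanning the joined loxP sites


def fraction_circularized(linear_signal: float, circular_signal: float) -> float:
    """Fraction of genomes in circular form, from two band intensities."""
    total = linear_signal + circular_signal
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return circular_signal / total


if __name__ == "__main__":
    # Hypothetical densitometry readings in arbitrary units.
    example = fraction_circularized(linear_signal=5.0, circular_signal=95.0)
    print(f"{CIRCULAR_BAND_BP} bp band fraction: {example:.0%}")
```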

At the time of infection (t=0, Figure 6B), the amount of input viral DNA detected
by DNA blot was higher for HepG2 cells than for HeLa cells when similar virus
multiplicities were applied (moi of 1,000). This may reflect differences in AdV
adsorption or infection efficiency between the two cell types, possibly as a result of the
lower levels of coxsackievirus-adenovirus receptor on the HeLa cell surface. To achieve
similar viral genome input into HepG2 and HeLa cells, HeLa cells were infected with
ten-fold more virus (moi of ~10,000) than HepG2 cells (moi of 1,000). Episomal DNA
samples were extracted and analyzed by blotting. The results (Figure 6C) indicated that
when comparable amounts of viral genome are present in the nucleus, the cyclization
rate in both cell types was similar. Because the level of subsequent GFP expression is
much higher in HepG2 cells than in HeLa cells (Figure 7A), it is likely that very small
amounts of Cre recombinase suffice to promote rearrangement, and that recombinase
expression is not limiting for rearrangement in either HepG2 or HeLa cells.

GFP expression cannot be detected until rearrangement has taken place, so the
measurement of the fraction of GFP positive cells provided a simple alternate method
for assessing the degree of productive rearrangement. Figure 7A shows that GFP
expression developed quickly in transduced HepG2 cells, but that only a few GFP
positive cells can be detected in HeLa cells infected with a ten fold higher moi,
conditions that allow circularization to a comparable extent to that seen in HepG2 cells
(Figure 6C).

The HCR12 promoter specificity was also tested by infecting two additional
non-hepatic cell lines, A431 (human epidermoid carcinoma) and HT29 (human colon
adenocarcinoma), with the Ad2HCRGFP/EBV vector. Both cell lines were obtained
from ATCC and cultured using DMEM/FBS as described herein. A few cells, with
weak GFP signal, were detected at 72 hours after infection in these cells (Figure 7B). In
contrast, these non-hepatic cells could be infected efficiently with a first generation
AdV, Ad2CMVGFP virus (data not shown), indicating that the low GFP signal was not
due to low infectivity of these cells by AdV.

To further assess the utility of the AdV genome rearrangement, primary human
hepatocytes were infected with the Ad2HCRGFP/EBV vector. For these experiments,
primary human hepatocytes, generously provided by Dr. Albert Edge (Diacrin, Inc., 
Charlestown, MA) were isolated and cultured as described by Gunsalus et al. (Nat. Med. 
3:48-53, 1997), infected with adenovirus, and GFP expression was analyzed. As shown 
in Figure 7C, GFP expression was readily detected 72 hours after infection. 


Diminished Viral Gene Expression in Rearranged AdV 

After excision, the adenovirus major enhancer/packaging signal segregates with 
the episomal DNA, yielding a linear fragment containing the remainder of the AdV 
genome without this important cis-element (Figure 5). To assess the impact of enhancer
deletion, PCR amplification and quantitative RT-PCR measurement of late viral gene
expression were performed as follows.

Four µg of total RNA was reverse transcribed into cDNA using M-MLV RT by a
standard protocol (Promega). 1 µl of the cDNA from each sample was used in
subsequent PCR reactions. PCR primers were designed to amplify the tripartite leader
sequence of the adenovirus late genes: TPL1 - 5' act ctc ttc cgc atc gct gt 3' (SEQ ID
NO.: 7) and TPL2 - 5' ctt gcg act gtg act ggt tag 3' (SEQ ID NO.: 8). For detection of the
AdV genome in the Hirt DNA samples, 1 µg DNA was employed in the PCR
amplification using the following primers which are specific for the adenovirus DNA in
the fiber gene: Fiber1 - 5' ccg cac cca cta tct tca ta 3' (SEQ ID NO.: 9) and Fiber2 - 5' ggt
gtc caa agg ttc gga ga 3' (SEQ ID NO.: 10). PCR reactions were performed as 95 °C 30
seconds; 54 °C 30 seconds; 72 °C 30 seconds for 30 cycles. All amplified products were
analyzed on a 2% agarose gel.
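
As a simple worked example, the cycling program above can be written as data and its total hold time computed, together with the GC content of the two reverse primers (TPL2 and Fiber2). The variable and function names are illustrative, and instrument ramp times and any initial denaturation or final extension steps are not included because they are not specified in the text.

```python
# Cycling program from the text: (step, temperature in deg C, seconds).
CYCLE_STEPS = [("denature", 95, 30), ("anneal", 54, 30), ("extend", 72, 30)]
CYCLES = 30


def total_hold_seconds(steps=CYCLE_STEPS, cycles=CYCLES) -> int:
    """Sum of programmed hold times, excluding ramping between temperatures."""
    return cycles * sum(seconds for _, _, seconds in steps)


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer; spaces between triplets are ignored."""
    seq = primer.replace(" ", "").lower()
    return sum(base in "gc" for base in seq) / len(seq)


# Reverse primers as given in the text (5' to 3').
TPL2 = "ctt gcg act gtg act ggt tag"
FIBER2 = "ggt gtc caa agg ttc gga ga"

if __name__ == "__main__":
    print(f"programmed hold time: {total_hold_seconds() / 60:.0f} min")
    for name, primer in (("TPL2", TPL2), ("Fiber2", FIBER2)):
        print(f"{name}: GC fraction {gc_fraction(primer):.2f}")
```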

For quantitative PCR, a molecular beacon based universal amplification and
detection system was used (Intergen). A common leading sequence (Z sequence, 5' act
gaa cct gac cgt aca 3') was added to the TPL1 and Fiber1 primers. The TPL2 and Fiber2
primers, described above, were used in the quantitative PCR reactions. 1 µl of the
cDNA and one µg of Hirt DNA from each sample were used in the assay. The PCR
reactions were carried out in a 96-well spectrofluorometric thermal cycler (Applied Biosystems
Prism 7700). The number of template molecules in the PCR reaction was calculated
from the standard curve using linearized plasmid as templates.
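
Quantitation from a standard curve is conventionally done by fitting threshold cycle (Ct) against log10 of the template copy number for a dilution series and then inverting the fitted line for unknown samples. The sketch below shows that calculation in plain Python. The standard-curve data points are hypothetical, and the text does not specify the fitting procedure actually used, so this is one common approach rather than the method of the experiments.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical standards: (template copies of linearized plasmid, observed Ct).
STANDARDS = [(1e3, 30.0), (1e4, 26.7), (1e5, 23.4), (1e6, 20.1)]

slope, intercept = fit_line(
    [math.log10(copies) for copies, _ in STANDARDS],
    [ct for _, ct in STANDARDS],
)


def copies_from_ct(ct: float) -> float:
    """Invert the standard curve to estimate template copies for a sample."""
    return 10 ** ((ct - intercept) / slope)


if __name__ == "__main__":
    print(f"slope = {slope:.2f} cycles per log10, intercept = {intercept:.2f}")
    print(f"sample with Ct 25.0 ~ {copies_from_ct(25.0):.3g} copies")
```

A slope near -3.3 cycles per tenfold dilution corresponds to close to a doubling of product per cycle, which is why that value was chosen for the hypothetical standards.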

As most late adenoviral gene transcripts share a common ~200 bp tripartite
leader sequence (TPL) (Akusjarvi and Persson, Nature 292:420-6, 1981), the TPL
sequence was chosen as a marker of viral gene expression. HepG2 cells were infected
with the first generation vectors Ad2CMVGFP and Ad2HCRGFP, or the self-resolving
vector, Ad2HCRGFP/EBV, using increasing multiplicities of infection. Total cellular
RNA and low molecular weight DNA were isolated in parallel: low molecular weight
DNA as described by Hirt (J. Mol. Biol. 26:365-9, 1967) and total RNA using RNAzol
solution (Tel-Test Inc.). RT-PCR was performed to quantitate the amount of RNA encoding
the TPL in the cDNA samples. PCR amplification of a 201 bp fiber gene fragment from the AdV
genome was used to detect the amount of viral genome in the DNA samples. A
representative result of three experiments is shown in Figure 8A. TPL sequences were
detected, 72 hours post-infection, with either 100 or 1000 viruses infected per cell, using
both of the first generation adenoviruses (upper panel).

In contrast, no TPL signal was detected in the self-resolving Ad2HCRGFP/EBV
infected cells, even at a moi of 100,000/cell. PCR amplification of the AdV fiber gene
revealed comparable levels of AdV genomic DNA in cells infected at comparable moi's
(Figure 8A, lower panel). The cDNA samples in which the TPL signals were detected
were further analyzed by real-time fluorescence PCR. The corresponding genomic
DNA samples were also analyzed to determine the number of AdV genomes present in
each sample. The results are summarized in Figure 8B. There were approximately
1 x 10^4 TPL per 1 x 10^6 AdV genomes detected in the Ad2HCRGFP infected cells, but no
detectable TPL in the self-resolving Ad2HCRGFP/EBV infected cells. These results
indicate that adenoviral gene expression was dramatically reduced by the separation of
the viral enhancer sequences occasioned by the re-arrangement of the self-resolving
vector.

TYPE A AND TYPE B VECTORS - EXPERIMENTAL RESULTS 

Additional experimental examples now follow that further illustrate the general
approaches of the invention relating to using and constructing type A and type B vectors.
For generating such adenoviral vectors, DNA sequences important for gene expression
in the target tissue were placed between two loxP sites. The first loxP site was inserted
between the Ad2 left-end inverted terminal repeat (ITR) and the enhancer sequence,
replacing a BspLU11I and BstZ17 fragment of Ad2. A target gene expression cassette,
comprising a promoter, a gene of interest, polyadenylation signals, the EBV replicon,
and a site-specific recombinase expression unit, was inserted in place of the E1 locus.
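
The arrangement described above, in which the sequences to be retained sit between two loxP sites, can be pictured with a toy model of site-specific excision. This sketch treats the vector as an ordered list of named elements and assumes two directly repeated sites. The element names and their order are illustrative only and do not reproduce the actual constructs.

```python
def excise_between_sites(linear_elements, site="loxP"):
    """Return (circular_product, linear_remainder) after recombination
    between the first two copies of `site` in a linear molecule."""
    first = linear_elements.index(site)
    second = linear_elements.index(site, first + 1)
    # Everything between the two sites leaves as a circle carrying one site.
    circle = linear_elements[first + 1:second] + [site]
    # The flanking sequences are rejoined through the other copy of the site.
    remainder = linear_elements[:first] + [site] + linear_elements[second + 1:]
    return circle, remainder


# Hypothetical element order, for illustration only.
vector = [
    "left ITR", "loxP", "enhancer", "expression cassette",
    "EBV replicon", "loxP", "adenoviral backbone", "right ITR",
]

circle, remainder = excise_between_sites(vector)
print("circular product:", circle)
print("linear remainder:", remainder)
```

In this toy model the enhancer ends up on the circular product, separated from the remaining adenoviral sequences.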

In type A adenoviral vectors, the second loxP site is placed between TetR and
VP16, preserving the coding frame of both (Figure 1A). A bidirectional promoter, in
which a central heptamer of tetracycline operator sites (TetO) (Gossen and Bujard, Proc.
Natl. Acad. Sci. USA 89:5547-5551, 1992) is flanked by two divergently oriented
basal elements, directs the expression of TetR-loxP-VP16 from a synthetic TATA
element, whereas Cre recombinase is controlled by the same heptamer of operator sites
upstream of the HIV LTR basal element.

In the case of type B viruses (Figure 1B), the second loxP site was inserted in the
first intron of the Ef1α gene, which contains the transcription stimulating sequences
described herein. In addition, a splice acceptor sequence was added to the 5' end of the
coding sequence of the gene of interest. To avoid rearrangement during plasmid
construction in bacteria, the Cre recombinase coding sequence was interrupted by the
addition of the human IgG1 hinge-CH2 intron (between amino acids Q78 and A79), as
described herein.

Designing a Compact EBV Replicon

Most plasmids employing the EBV latent origin of replication exceed 10 kb in
length. To provide a means for increasing the capacity of the recombinant adenoviral
type A or type B vectors to accommodate a therapeutic gene, a compact EBV replicon
having episomal stability was designed. To this end, deletions were generated in both
the cis-acting origin of replication, OriP, and the sequences encoding the trans-acting
replication protein, Epstein Barr virus nuclear antigen-1 (EBNA-1) (Figure 9A).
Episomal persistence was assessed with a green fluorescent protein (GFP)-bearing test
plasmid by determining the fraction of cells retaining green fluorescence as a function of
time, assuming that the half-life of GFP, in daughter cells that have not received an
episome as a result of segregation failure, is approximately 1.4 days (Fukumura et al.,
Cell 94:715-725, 1998).

EBNA-1 contains a central repeated structure that consists entirely of Gly and
Ala residues, termed the GA repeats (Figure 9A). Although deletion of this structure has
been reported to have little consequence, a deletion mutant consisting of both a short
OriP and a short EBNA-1 (SoriP + SEBNA1) was generated and found not to support
plasmid maintenance effectively (approximately 40% loss per cell division). A version
of this mutant, reconstructed with 40 GA repeats, in which the short OriP was paired
with a short EBNA-1, provided significantly better plasmid stability (20% loss per cell
division vs. 10% per cell division for the wild type) (Figure 9B). Since most target
tissues are relatively quiescent mitotically, this level of segregation fidelity provides
reasonable stability in a compact replicon.
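
To give a feel for what these per-division loss rates imply, the following sketch computes the fraction of cells expected to retain the episome after a number of divisions. It assumes, as a simplification not stated in the text, that the loss rate is constant and independent from one division to the next, and it ignores the GFP half-life correction. The function names are hypothetical.

```python
import math


def fraction_retaining(loss_per_division, divisions):
    """Fraction of cells still carrying the episome after `divisions`
    divisions, assuming a constant, independent loss per division."""
    return (1.0 - loss_per_division) ** divisions


def divisions_until_half(loss_per_division):
    """Number of divisions after which half of the cells have lost the episome."""
    return math.log(0.5) / math.log(1.0 - loss_per_division)


# Loss rates quoted in the text for the three replicons.
loss_rates = {
    "short OriP + short EBNA-1, no GA repeats": 0.40,
    "short OriP + short EBNA-1, 40 GA repeats": 0.20,
    "wild type": 0.10,
}

for name, loss in loss_rates.items():
    retained = fraction_retaining(loss, divisions=5)
    half = divisions_until_half(loss)
    print(f"{name}: {retained:.3f} retained after 5 divisions, "
          f"half lost after {half:.1f} divisions")
```

Under this simple model, cells that rarely divide lose the episome slowly even at a 20% loss rate, which matches the reasoning given for mitotically quiescent target tissues.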

Producing Cell Lines that Express Cre- or FLP-Dominant Negative Mutants

As discussed herein, one obstacle to creating adenovirus carrying both
recombinase and target sites has been the difficulty of controlling recombinase activity
during virus propagation. Since efficient recombinase activity is needed in target cells,
recombinase activity is best tempered in the production cell line.

Vector-independent methods to suppress recombinase activity during the
production phase are attractive because they allow vector design objectives to
be pursued with fewer constraints. In principle, dominant negative recombinase mutants
provide the desired antagonism of recombinase activity. Cell lines expressing such
recombinase dominant negative mutants were produced as follows.

Dominant negative Cre mutants were selected from known point mutants
(Wierzbicki et al., J. Mol. Biol. 195:785-794, 1987) that are defective in recombination
function but are likely to retain dimerization function (Figures 10A-E). Several mutants
were screened for their abilities to inhibit Cre activity of a type B vector construct
(ad2239 in Figure 10B) in a transient cotransfection assay. Under these conditions, Cre
activity is detected by the expression of GFP that occurs upon rearrangement. Figures
10D and 10E show the point mutants that were assessed and their relative activities in
the transient cotransfection assay. Dilution studies, in which increasing amounts of
substrate/Cre plasmid were cotransfected with the mutant forms, were conducted and,
based on its favorable profile, one mutant recombinase, designated CreY324C, was
chosen for further development (Figure 10D).

Strong constitutive expression of CreY324C under control of the Ef1α promoter
failed to yield stable cell lines. Stable clones were obtained when the Ef1α promoter
was replaced with a tetracycline regulated promoter (Gossen et al., Science 268:1766-
1769, 1995). Clones were then tested for the ability to inhibit Cre enzymatic activities,
and one clone, designated cell line #17, was selected for additional experiments. When
a plasmid bearing Cre and capable of undergoing Cre-directed rearrangement to create a
GFP transcription unit (ad2239) was transfected into #17 cells or parental 293 TetON cells,
GFP expression in the #17 cells in the presence of 2 µM doxycycline was significantly
lower than that of controls (Figure 11), showing that Cre enzyme activity can be
inhibited in #17 cells.

In addition to dominant negative Cre mutants, dominant negative FLP mutants
may also be identified. FLP belongs to the same family of site-specific recombinases as
Cre recombinase. A number of FLP mutations that show defects in either cleavage or
ligation of FRT sites have been identified. Mutant FLPs defective in cleaving FRT sites
(for example, H309L, L315P, G328R, G28E, N329D, S336Y, S336F, A339D, Y343F,
and H345L) are generated using standard methods. Mutants that inhibit the wild type
enzyme are then identified for generating stable cell lines according to the methods
described above. These and the other cell lines (described herein) are then used for
producing FRT/FLP-containing virus.

As mentioned above, difficulties creating stable cell lines expressing Cre
dominant negative mutants were occasionally encountered. This difficulty is not limited
to Cre mutants, but extends to the wild-type Cre enzyme. In contrast, 293 cell lines stably
expressing a thermostable FLP, referred to as FLPe (Buchholz et al., Nat. Biotechnol.
16:657-662, 1998), were created, suggesting that FLPe might not be as cytostatic as Cre
protein. To demonstrate this, 293 cells were transfected with plasmids expressing either
Cre or FLPe, and puromycin resistant colonies were selected. To generate stable cell
lines expressing Cre or FLP mutants, 293 TetON cells were transfected with linearized
plasmid expressing Cre or FLP mutants and puromycin acetyltransferase and selected
with 1 µg/ml of puromycin. Puromycin resistant colonies were characterized further for
their ability to inhibit recombinase activity using the Cre (ad2239) or FLP (ad2879) substrate
plasmids. Table 1 shows that there are more puromycin resistant colonies selected from
FLPe transfected cells than from Cre transfected cells. From this result, it is expected
that stable cell lines expressing a reasonably high level of dominant negative FLP may
be readily created.



TABLE 1

Puromycin Resistant Colonies Formed When Cre Expressing or FLPe Expressing
Plasmid was Used to Transfect 293 Cells

Expression Plasmid    Number of colonies (2 µg/ml puromycin)
Control               236
Cre                   92
FLPe                  127

Cre or Cre dominant negative mutants were also found to inhibit FLP activity
(Table 2). Accordingly, cell lines such as cell line #17, that stably express a Cre
dominant negative mutant (for example, CreY324C), are useful for producing FLP/FRT
carrying adenovirus.

TABLE 2

Cre Inhibition of FLP Activity in trans

Plasmids                                      Arbitrary GFP intensity
Ef1α FLP + FLP substrate + vector control     4.9
Ef1α FLP + FLP substrate + Ef1α Cre           0.78
Ef1α FLP + FLP substrate + Ef1α CreR173C      2.23

FLP enzyme activity was measured by the GFP intensity by cotransfecting with a FLP substrate
plasmid, ad2879 (Figure 19A). GFP intensity was quantified using IP lab software.
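
The degree of inhibition implied by the Table 2 intensities can be expressed as a fraction of the vector-control signal. The short sketch below does that arithmetic. It assumes that GFP intensity scales linearly with FLP activity, which the text does not state, and the function name is hypothetical.

```python
def fractional_inhibition(control_signal, test_signal):
    """Fraction of the control signal that is lost in the test condition."""
    return 1.0 - test_signal / control_signal


# Arbitrary GFP intensities taken from Table 2.
vector_control = 4.9
with_cre = 0.78
with_cre_r173c = 2.23

cre_inhibition = fractional_inhibition(vector_control, with_cre)
r173c_inhibition = fractional_inhibition(vector_control, with_cre_r173c)

print(f"Cre:      {cre_inhibition:.0%} reduction in GFP signal")
print(f"CreR173C: {r173c_inhibition:.0%} reduction in GFP signal")
```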



Transcriptional Regulation of Cre or FLP Recombinases

It has been relatively difficult to achieve high-level promoter inducibility in a
replicating adenovirus. The challenge is similar to that of achieving faithful control of
transcription in a transient expression setting. One approach to increase the induction
ratio in a transient setting is the use of auto-regulatory (feed-forward) circuits. One such
system, based on tetracycline dependent activation, is shown in Figure 12. A central
heptamer of tetracycline promoter operator sites (TetO sites) was placed between two
divergently oriented basal TATA elements. The leftward TATA controls the expression
of the TetR-VP16 fusion protein, in which a loxP (or FRT) site has been placed between
the TetR DNA binding domain and the VP16 transcriptional activator. The rightward
TATA box directs the synthesis of recombinase, either Cre or the yeast FLP enzyme. In
the presence of tetracycline, the promoter has reduced activity in both directions. Upon
removal of tetracycline, the synthesis of both TetR-VP16 and recombinase is induced
(Figure 1A). The induced recombinase then disjoins the TetR DNA binding element
from the transcriptional activation contributed by VP16. Any existing TetR-VP16
fusions thereafter promote transcription of TetR, which competes with TetR-VP16 for
TetO, resulting in deinduction of recombinase transcription.
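
The sequence of states in this auto-regulatory circuit can be summarized in a small lookup function. This is a purely qualitative sketch of the logic described above, not a kinetic model. The function name, the dictionary keys, and the state labels are hypothetical.

```python
def circuit_state(tetracycline_present, site_recombined):
    """Qualitative state of the bidirectional TetO promoter.

    tetracycline_present: True while tetracycline is in the medium.
    site_recombined: True once the recombinase has separated the TetR
    coding region from the VP16 coding region.
    """
    # After recombination the leftward transcript encodes TetR without VP16.
    leftward_product = "TetR" if site_recombined else "TetR-VP16"

    if tetracycline_present:
        activity = "reduced"      # promoter has reduced activity in both directions
    elif site_recombined:
        activity = "deinduced"    # TetR competes with residual TetR-VP16 for TetO
    else:
        activity = "induced"      # TetR-VP16 drives both transcription units
    return {"leftward_product": leftward_product,
            "recombinase_transcription": activity}


# Walk through the order of events described in the text.
for tet, recombined in [(True, False), (False, False), (False, True)]:
    print(tet, recombined, circuit_state(tet, recombined))
```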

When a model target cell line, HepG2, was tested with this type of adenovirus,
the efficiency of circularization was low relative to that seen in 293 cells (data not
shown), indicating a cell dependence of the bidirectional TetO promoter. To correct
this, the TATA element of the TetO synthetic promoter (derived from the CMV
immediate early promoter) was replaced with that of the HIV LTR. Constructs bearing
differing components of the HIV basal promoter were analyzed for strength and
regulation in 293 and HepG2 cells (Figure 13A). Among the constructs tested, one
version bearing the HIV LTR TATA and Sp1 elements (D in Figure 13A) showed the
least basal expression in 293 cells (data not shown) and the greatest induction in HepG2
cells (Figure 13B).

Using this promoter, a construct (ad3400) containing the autoregulatory
structure D as shown in Figure 13A was engineered, and Cre activities in the presence
and in the absence of tetracycline were assayed. Plasmid ad2265 (Figure 14), in which a
blue fluorescent protein (BFP) expression unit is interrupted by two loxP sites and
transcription termination sequences, was used as a substrate for Cre. Cre-mediated
recombination joins BFP to the promoter, resulting in BFP expression. As shown in
Table 3, no difference was found in the intensity of BFP expression, either in the
presence or absence of tetracycline. One possible explanation for this is that very little
Cre protein is required for activity. Consistent with this idea, standard
immunohistochemical techniques failed to reveal the presence of Cre enzyme in cells that
were fully induced (data not shown).

TABLE 3

Cre Recombinase Activity Regulation in Type A Constructs

Construct   Cre Form       +tet -tam   -tet -tam   +tet +tam   -tet +tam
ad3400      Cre            2.58        3.39        2.71        6.20
ad4394      Cre-LBD        0.081       0.22        0.056       1.02
ad4705      LBD-Cre-LBD    ND          ND          ND          ND

Cre enzyme activity was measured in the presence or absence of the ligand, tamoxifen, by
cotransfecting with the substrate plasmid (ad2265). BFP intensity (mean intensity/area) was
quantified by analyzing fluorescent images captured by a digital camera using IP lab software.
ND refers to fluorescent intensities that were too weak to measure. tet, tetracycline; tam,
tamoxifen.
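
One way to read Table 3 is as induction ratios: the signal without tetracycline divided by the signal with tetracycline, at a fixed tamoxifen condition. The sketch below computes these ratios for the two constructs with measurable signal. It assumes that the four value columns are in the order +tet -tam, -tet -tam, +tet +tam, -tet +tam, and the helper name is hypothetical.

```python
# BFP intensities from Table 3, keyed by (tetracycline, tamoxifen) condition.
table3 = {
    "ad3400 (Cre)": {
        ("+tet", "-tam"): 2.58, ("-tet", "-tam"): 3.39,
        ("+tet", "+tam"): 2.71, ("-tet", "+tam"): 6.20,
    },
    "ad4394 (Cre-LBD)": {
        ("+tet", "-tam"): 0.081, ("-tet", "-tam"): 0.22,
        ("+tet", "+tam"): 0.056, ("-tet", "+tam"): 1.02,
    },
}


def tet_induction(values, tam):
    """Ratio of signal without tetracycline to signal with tetracycline."""
    return values[("-tet", tam)] / values[("+tet", tam)]


for construct, values in table3.items():
    for tam in ("-tam", "+tam"):
        ratio = tet_induction(values, tam)
        print(f"{construct}, {tam}: {ratio:.1f}-fold change on tetracycline removal")
```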



Deletion of the PolyA Consensus Sequence from Cre or FLP Transcription Units

To reduce the expression of FLP or Cre recombinase further, the consensus
polyA addition signals from the Cre or FLP transcript unit were deleted from vector
constructs, leaving polyadenylation dependent on distal downstream sequences, for
example, in gene IX. The activity of Cre using type B proviral constructs with or
without the polyA signal was measured. As shown in Table 4, the construct without polyA
signals (AD229.3) showed a significant reduction of GFP intensity compared to a
construct bearing the polyA signal (AD230.5). When FLPe constructs of similar
structure were evaluated, similar results were found (data not shown). These data show
that Cre and FLPe enzyme activity levels can be modulated by attenuating
polyadenylation.

TABLE 4

Effect of Deleting polyA Addition Signal From the Cre Expression Unit on Cre Enzyme
Activity Level

            polyA   Relative GFP Intensity
AD229.3     -       0.25
AD230.5     +       1

Post-Transcriptional Regulation of Cre Recombinase Activity 

Post-transcriptional control mechanisms of Cre recombinase activity were also
evaluated. Translational fusions between Cre and the ligand binding domain (LBD) of
estrogen receptor have been reported to be regulated by estrogen (Feil et al., Proc. Natl.
Acad. Sci. U.S.A. 93:10887-10890, 1996; Gossen et al., Proc. Natl. Acad. Sci. U.S.A.
89:5547-5551, 1994), or, in the case of mutant estrogen receptors (Metzger et al., Proc.
Natl. Acad. Sci. U.S.A. 92:6991-6995, 1995), by the partial antagonist tamoxifen.

Use of a ligand-dependent recombinase (ad4394 in Table 3), in combination with
the HIV LTR-based autoregulated Tet system, allowed for a small degree of regulation
by tetracycline, but not by ligand, as assayed using the ad2265 rearrangement assay
(Table 3). One interpretation of this finding is that fusion of the estrogen receptor LBD
to Cre provides only modest control of recombinase activity, but attenuates enzyme
potency to a level at which transcriptional regulation can be measured.

To increase control of recombinase activity, the LBD was fused both to the N-terminus
and C-terminus of Cre (LBD-Cre-LBD) and inserted into the coding sequence
of both type A and type B vectors. When the LBD-Cre-LBD construct of type A was
transfected into 293 cells, it showed no significant Cre enzyme activity even in the
presence of ligand (Table 3). This result confirmed that the Cre recombinase activity is
attenuated by N-terminal or C-terminal extension.

When the LBD fusion Cre enzymes were assayed in the type B vector context,
only LBD-Cre-LBD fusions (pk8-ad4626) showed ligand-dependent regulation of Cre
enzyme activities (Table 5). It appears that attenuated Cre activity in LBD-Cre-LBD, in
the absence of ligand, is low enough to fall below the upper limit of the Cre assay.

TABLE 5

Cre Enzyme Activities of Type B Provirus

Cre Form       Provirus      -tam    +tam
Cre            pk8-ad2239    ND      ND
Cre-LBD        pk8-ad4332    4.1     6
LBD-Cre-LBD    pk8-ad4626    0.05    4.2

Cre enzyme activity was measured in the presence or absence of the ligand, tamoxifen. GFP
intensity was quantified using IP lab software.
tam, tamoxifen.

Consistent with this notion, only the construct carrying two LBDs, pk8-ad4626,
was able to produce virus (AD121.5) by transfection and propagate in 293 cells, while
pk8-ad4332, which carried one LBD, produced virus (AD100.9) initially (following
transfection of the cognate DNA) but was unable to propagate in 293 cells (Table 6). In
the case of wild type Cre, no virus was produced in 293 cells by transfection.



TABLE 6

Production and Propagation of Type B Adenovirus

Cre Form       Type B adenovirus       Viral Production   Viral Propagation   Viral Propagation
                                       in 293 cells       in 293 cells        in #17 cells
Cre            Pack8-2239              -
Cre-LBD        AD100.9 (Pack8-4332)    +                  -                   +
LBD-Cre-LBD    AD121.5 (Pack8-4626)    +                  +                   +

The AD100.9 virus was able to propagate in #17 cells expressing the dominant
negative CreY324C, demonstrating that modulation of Cre activity is important for viral
production. Thus, adenoviruses carrying both two loxP sites and Cre, in two different
configurations, were generated by controlling Cre activity.

Viral Rearrangement in Culture

Cre/loxP mediated rearrangement of the adenovirus in tissue culture cells has
also been analyzed. As shown in Figure 15A, the AD121.5 virus showed a significant
increase in GFP expression in the presence of the ligand, estrogen, suggesting a
successful rearrangement of the virus by Cre recombinase. When non-chromosomal
DNA (Hirt, J. Mol. Biol. 40:141-144, 1969) was made from the cells and analyzed by
DNA blot analysis, the viral DNA from estrogen treated cells was found mainly in
circular form (C in Figure 15B), while the DNA from cells not treated with estrogen was
found mainly in linear form (L in Figure 15B).

To evaluate the efficiency of the self-rearranging viruses in vivo, high titer stocks
of AD102.7 (a type A virus carrying LBD-Cre, pk8-ad4394) were prepared in #17 cells
and purified by CsCl gradient ultracentrifugation. The titer of AD102.7 (4-6 x 10^12/ml
by OD) is comparable to or slightly exceeds that of control viruses (2-4 x 10^12/ml by
OD) which carry neither Cre nor a loxP site. To determine the efficiency of viral
rearrangement in vivo and whether such rearrangement is dependent on the presence of
ligand, AD102.7 virus (4 x 10^11 pfu/mouse as determined by optical density) was
injected via tail vein into Rag-2 mice that were pretreated with vehicle alone or 110
µg/day of tamoxifen for 7 days as follows.

Rag-2 mice were injected with either PBS (mock) or 4 x 10^11 adenovirus
particles (as determined by OD260) of type A virus, AD102.7, via the tail vein. At
various times after injection, animals were sacrificed and the liver tissues were removed
and frozen rapidly on dry ice. To visualize GFP expression in animal tissues, mice were
anaesthetized and perfused intracardially with 4% paraformaldehyde containing 0.2%
glutaraldehyde (Kafri et al., Nat. Genet. 17:314-317, 1997), and the liver tissues were
removed and fixed overnight at room temperature in the perfusion buffer containing
30% sucrose. The fixed tissues were sectioned serially and observed under confocal
scanning laser microscopy. In experiments evaluating the responses of ligand-regulated
recombinase, mice were injected either with vehicle (vegetable oil) alone or with 110
µg/day of tamoxifen for 7 days prior to adenoviral injection.

Liver tissues from these animals were harvested at 2.5 hrs post injection (the
earliest time point taken after injection) and Hirt DNA from approximately 250 mg of
frozen hepatic tissue was prepared and analyzed by blot analysis. As shown in Figure
16, the majority of adenoviral DNA was found in circular form in tissues from untreated
mice, as well as tamoxifen-treated mice. It can be concluded from these data that the
Cre enzyme activity present in the tissue, even in the absence of ligand, was sufficient for
efficient self-rearrangement of virus. As expected, the hepatic tissues from the Rag-2
mice injected with AD102.7 showed strong expression of GFP (Figure 17).

The demonstration that AD102.7 virus, produced efficiently in 293 cells at high titers
by conventional means, can self-rearrange efficiently in vivo provides proof of
the concept that potentially safer adenoviral gene therapy vectors can be produced.

Adenoviruses Carrying Both FRT and FLP Recombinase 

Type A and type B proviral constructs carrying both FRT (FLP recombinase
recognition site) and FLP recombinase were also generated. Structures of these viruses
are analogous to those of loxP/Cre carrying viruses except that loxP sites are replaced by
FRT sites and the Cre coding sequence is replaced by the FLP coding sequence (Figures 18A,
18B).

Virus Production at Reduced Temperature

Temperature dependence of the Ef1α promoter, using GFP expression as a
marker, was also examined. As shown in Table 7, Ef1α promoter activity is strongly
reduced at 32°C in comparison to 37°C or 39°C. The temperature sensitive nature of the
Ef1α promoter was used to propagate type B adenovirus carrying FLP at 32°C following
initial production of the virus by DNA transfection (pk8-ad3302) at 37°C. HepG2 cells
infected with these viruses (AD41.4) showed strong GFP expression, but with an
approximately 12 hr delay compared to GFP expressing viruses, suggesting that FLP
recombinase activity may be impaired at 37°C. To improve the activity of FLP
recombinase, viral constructs were created using a thermostable FLP (referred to as
"FLPe") described by Buchholz et al. (Nat. Biotechnol. 16:657-662, 1998).

TABLE 7

Effects of Temperature on Ef1α Promoter Strength as Shown by GFP Intensities

                                 Arbitrary GFP Intensities
Tester plasmids      Time        32°C      37°C      39°C
Ef1α GFP             16 hrs      1475      7886      11409
                     41 hrs      6472      36699     50787
                     86 hrs      16256     53370     54424
Ef1α Cre + ad2204    16 hrs      243       1141      2132
                     41 hrs      1094      9119      9784
                     86 hrs      695       3219      8144

GFP intensities were measured using a fluorescent reader.



The activities of FLP and FLPe using a FLP substrate plasmid (ad2879; Figure
19A) in 293 cells were compared. As shown in Table 8, FLPe is significantly more
active than FLP under these conditions.

TABLE 8

FLPe is Significantly More Active than FLP Recombinase

Plasmid                 Mean GFP intensity
ad4821     Ef1α FLPe    2.39
ad2949     Ef1α FLP     0.01

Plasmid coding either FLPe or FLP was cotransfected with a FLP substrate plasmid (Figure 19A)
into 293 cells. GFP intensity of each transfection was measured using IP lab program.



In addition, a tamoxifen-regulated FLPe was created by fusing the ligand-binding
domain from a mutant form of estrogen receptor to the FLPe coding sequence at its
C-terminus (FLPe-LBD). The FLPe-LBD was found to be regulated by the ligand,
tamoxifen (Table 9). Although FLP activity was retained by C-terminal fusion
(FLP-LBD), addition of a short oligopeptide tag to the N-terminus of FLP abolished its
activity (data not shown).

TABLE 9

Tamoxifen Regulation of FLPe as Determined by GFP Intensities

Plasmid            FLPe             -tam    +tam
ad4821 + ad2879    FLPe             ++++    ++++
ad5022 + ad2879    FLPe-LBD(tam)    +       +++

GFP intensity resulting from FLPe mediated recombination was measured using a
fluorescent microscope. tam, 2 µg/ml tamoxifen.



Inhibition of FLPe Activities by Anti-sense FLPe

An anti-sense approach to inhibit FLP enzyme activity was also employed. This
approach tested the notion that incorporation of an open reading frame into an antisense
transcript would stabilize the transcript and potentiate antisense activity. Two
approaches were utilized. In one approach, the BFP coding sequence was placed
upstream of anti-FLPe. In the second approach, an anti-FLPe was placed upstream of an
internal ribosome entry sequence (IRES) and the BFP coding sequence (Figure 20). The
ability of these constructs to inhibit FLPe was assayed using a FLP substrate plasmid,
ad2879 (Figure 19A), and the result is shown in Figure 21. These data show that
anti-sense FLPe is more effective in inhibiting FLPe function when it is fused to BFP, which
can presumably be replaced with any other stable protein.

OTHER SELF-REARRANGING ADENOVIRUSES 

Mixed Infection With Adenoviruses Carrying loxP/FLP and FRT/Cre

One of the ways to produce adenoviruses that can be rearranged in target cells
but not producer cells is to engineer two separate viruses, each carrying one recombinase
and the target sequence for the other. To test this system, type B adenoviral constructs
carrying Cre recombinase and FRT sites, and FLPe recombinase and loxP sites, were
created (Figure 22). In target cells infected with both viruses, Cre catalyzes
recombination between the two loxP sites in the FLP virus, and FLP carries out FRT
mediated recombination in the Cre virus, resulting in two circular plasmids. The loxP
virus contained BFP, whereas the FRT virus contained GFP. Measurement of the
fluorescent intensities of GFP and BFP, after cotransfecting the two constructs, revealed
that BFP expression (mediated by Cre enzyme) was greater than GFP expression
(mediated by FLP enzyme), suggesting that the Cre enzyme functions more efficiently
than FLP.

WO 02/20814    PCT/US01/27682

Accordingly, these two recombinase activities in the target cells need to be
balanced for complete circularization of both viral vectors. Exemplary methods for
modulating Cre/FLP activity include the use of transcriptional regulation (such as by
varying promoter strength and/or including or omitting the polyA addition signal sequence),
translational and/or post-translational regulation (such as by changing FLP to FLPe and
making LBD fusion proteins), and post viral production control (such as by changing the
ratio of the two viruses).

In one approach, Cre was replaced by Cre-LBD and FLP was replaced by FLPe.
To improve identification of the rearrangement products, BFP was replaced with RFP as
a marker for Cre recombination. As shown in Table 10, in the presence of estrogen,
expression of GFP (FLPe mediated) and RFP (Cre mediated) were similar.

TABLE 10

RFP and GFP Expression of Cells Cotransfected With Type B Proviral Constructs
Carrying Cre-LBD/FRT (GFP) or FLPe/loxP (RFP)

                                                    GFP intensity     RFP intensity
Plasmids        Genotypes              Estrogen:    -       +         -       +
pk8-ad5120      Cre-LBD/FRT (GFP)                   3222    3183      46      1954
+ pk8-ad5113    + FLPe/loxP (RFP)

To ensure that both Cre and FLP carrying viruses infect each target cell at an
optimal ratio, these viruses can be cross-linked prior to infection. For example, Cre
carrying virus is labeled with biotin while FLP carrying virus is labeled with avidin. Mixing
the two types of modified viruses generates virus complexes of desired proportions.
Biotinylation or avidinylation can be carried out using commercially available reagents
such as EZ-Link TFP-PEO biotin (Pierce) and EZ-Link maleimide activated
NeutrAvidin (Pierce). The extent of biotin/virus and avidin/virus modification will be
empirically determined to ensure the viability of the virus and to obtain an optimal ratio
of the two viruses in the complex. Optimal ratios will be those resulting in 1:1 Cre and FLP
recombinase activities in target cells. The modifications will be done following the
manufacturer's instructions.

This approach not only increases the effective capacity of the adenoviral vector but
also opens new avenues of application involving multiple proteins, some of which
cannot be coexpressed in a production cell line as a result of combination toxicity.

All references mentioned herein are hereby incorporated by reference. 
Other embodiments are within the claims. 

What is claimed is:

Claims

1. A replicatable viral DNA vector encoding a site-specific DNA-altering
enzyme and a DNA target recognized by said enzyme, said enzyme selectively
converting, in a cell expressing said enzyme, said DNA vector to a rearranged
form.

2. The vector of claim 1, wherein said rearranged form comprises an
autonomously replicating episome.

3. The vector of claim 1, wherein said rearranged form comprises linear and
circular DNAs.

4. The vector of claim 1, wherein said vector comprises adenoviral DNA.

5. The vector of claim 1, wherein said vector comprises a genetically-engineered
recombination site.

6. The vector of claim 5, wherein said recombination site comprises a target
of Cre or FLP.

7. The vector of claim 1, wherein said enzyme comprises a recombinase or
an integrase.

8. The vector of claim 7, wherein said recombinase is Cre or FLP
recombinase.

9. The vector of claim 1, wherein said enzyme is functional in a mammalian
cell.

10. The vector of claim 5, wherein said recombination site comprises a
recognition sequence of a site-specific DNA-altering enzyme.

11. The vector of claim 1, wherein said vector comprises an origin of
replication functioning in a mammalian cell.

12. The vector of claim 11, wherein said origin of replication is an Epstein
Barr Virus replicon.

13. The vector of claim 1, wherein said vector comprises a gene of interest.



14. A method for assembling a recombinant adenoviral DNA, said method
comprising the steps of: (a) providing a first linearized DNA vector comprising a
restriction site and a cos site and a second linearized DNA vector comprising
said restriction site, an adenoviral nucleic acid molecule, and a cos site; and (b)
ligating said first and second linearized DNA vectors, said ligation assembling a
recombinant adenoviral DNA.


15. The method of claim 14, wherein said first linearized DNA vector comprises a
selectable marker. 



16. The method of claim 14, wherein said first linearized DNA vector
comprises an adenoviral left-end inverted terminal repeat.

17. The method of claim 14, wherein said first linearized DNA vector 
comprises a gene of interest. 

18. The method of claim 14, wherein said second linearized DNA vector
comprises a selectable marker.

19. The method of claim 14, wherein said second linearized DNA vector 
comprises an adenoviral right-end inverted terminal repeat. 


20. The method of claim 14, said method further comprising packaging said 
assembled adenoviral DNA into a phage and infecting a host cell. 

21. The method of claim 14, wherein said first and second linearized DNAs
comprise a cosmid vector. 

22. The method of claim 14, wherein said adenoviral DNA is flanked by
cleavage sites.

23. The method of claim 22, wherein said cleavage sites comprise intron
endonuclease cleavage sites. 

24. An adenovirus producer cell comprising a nucleic acid molecule that
expresses a dominant negative site-specific DNA-altering enzyme.

25. The producer cell of claim 24, wherein said site-specific DNA altering 
enzyme is a dominant negative recombinase. 






26. The producer cell of claim 25, wherein said recombinase is a Cre or Flp
recombinase.



27. The producer cell of claim 26, wherein said dominant negative
recombinase is CreY324C.

28. The producer cell of claim 26, wherein said Flp recombinase is Flpe.

29. The producer cell of claim 24, wherein said cell is a 293 human
embryonic kidney cell.

30. A vector comprising, in the 5' to 3' direction,

a first genetically engineered cis-acting target recognized by a site-specific DNA
altering enzyme;

a gene of interest;

a lineage-specific gene promoter;

a second genetically engineered cis-acting target recognized by a
site-specific DNA altering enzyme; and

a nucleic acid molecule encoding a site-specific DNA altering enzyme.

31. A vector comprising, in the 5' to 3' direction,

a first genetically engineered cis-acting target recognized by a
site-specific DNA altering enzyme;

a gene of interest;

a bi-directional promoter, comprising a second genetically engineered
cis-acting target recognized by a site-specific DNA altering enzyme; and

a nucleic acid molecule encoding a site-specific DNA altering enzyme.



32. A method of gene therapy comprising the administration to a patient in
need of gene therapy a therapeutically effective amount of the vector of any one
of claims 1, 30, or 31, which is expressed in said patient.






33. A population of cells transfected with the vector of any one of claims 1, 
30, or 31. 

34. A method of gene therapy comprising the administration to a patient in 
need of gene therapy a therapeutically effective amount of the population of cells 
of claim 33. 




SEQUENCE LISTING
<110> The General Hospital Corporation 
<120> Self-rearranging DNA vectors



<130> 00786/352WO3 

<150> US 60/231,053 
<151> 2000-09-08 

<150> US 60/246,904 
<151> 2000-11-08 

<160> 10 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 2341 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> derived from Adenovirus 
<400> 1 

gaatcggcca gcgcgaattc gattatcatc atcataatat accttatttt ggattgaagc 60 
caatatgata atgagggggt ggagtttgtg acgtggcgcg gggcgtggga acggggcggg 120 
tgacgtaggt tttagggcgg agtaacttgc atgtattggg aattgtagtt tttttaaaat 180 
gggaagttac gtacgcggca tcgatgcgcg ggatatcgcg gcggctagcg acatgaggtt 240 
gccccgtatt cagtgtcgct gatttgtatt gtctgaagtt gtttttacgt taagttgatg 300 
cagatcaatt aatacgatac ctgcgtcata attgattatt tgacgtggtt tgatggcctc 360 
cacgcacgtt gtgatatgta gatgataatc attatcactt tacgggtcct ttccggtgat 420 
ccgacaggtt acggggcggc gacctcgcgg gttttcgcta tttatgaaaa ttttccggtt 480 
taaggcgttt ccgttcttct tcgtcataac ttaatgtttt tatttaaaat accctctgaa 540 
aagaaaggaa acgacaggtg ctgaaagcga ggctttttgg cctctgtcgt ttcctttctc 600 
tgtttttgtc cgtggaatga acaatggaag ttaacggatc caggccgcga gcaaaaggcc 660 
agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat aggctccgcc 720 
cccctgacga gcatcacaaa aatcaacgct caagtcagag gtggcgaaac ccgacaggac 780 
tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct gttccgaccc 840 
tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg ctttctcata 900
gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc 960 
acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt cttgagtcca 1020 
acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg attagcagag 1080 
cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac ggctacacta 1140 
gaagaacagt atttggtatc tgcgctctgc caaagccagt taccttcgga aaaagagttg 1200 
gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt gtttgcaagc 1260 
agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt tctacggggt 1320 
ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatcaga ttatcaaaaa 1380 
ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc taaagtatat 1440 
atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct atctcagcga 1500 
tctgtctatt tcgttcatcc atagttgcct gactccccgt agtgtagata actacgatac 1560 
gggagggctt accatccggc cccagtgctg caatgatacc gcgtgaccca cgctcaccgg 1620 
ctcctgattt atcagcaata aaccagccag ccggaagtgc cgagcgcaga agtggtcctg 1680 
caactttatc cgcctccatc cagtctatta gttgttgccg ggaagctaga gtaagtagtt 1740 
cgccagttaa tagttttcgc aacgttgttg ccattgctac aggcatcgtg gtgtcacgct 1800
cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga gttacatgat 1860
cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatagtt gtcagaagta 1920
agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct cttactgtca 1980
tgccatccgt aagatgcttt tctgtgactg gtgagtattc aaccaagaat acgggataat 2040
accgcgccac atagcagaac tttaaaagtg ctcatcattg ggaaacgttc ttcggggcga 2100
aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgcgcaccc 2160
aagtgatctt ctgcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg 2220
caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact catacttttc 2280
ctttttcaat attattgaag catttatcag ggttattgtc tcatcagcgg atacatattt 2340
g 2341



<210> 2 
<211> 34616 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> derived from Adenovirus 
<400> 2 

gaatcggcca gcgcgaatta actataacgg tcctaaggta gcgtcatcat cataatatac 60
cttattttgg attgaagcca atatgataat gagggggtgg agtttgtgac gtggcgcggg 120
gcgtgggaac ggggcgggtg acgtaggttt tagggcggag taacttgcat gtattgggaa 180
ttgtagtttt tttaaaatgg gaagttacgt atcgtgggaa aacggaagtg aagatttgag 240
gaagttgtgg gttttttggc tttcgtttct gggcgtaggt tcgcgtgcgg ttttctgggt 300
gttttttgtg gactttaacc gttacgtcat tttttagtcc tatatatact cgctctgtac 360
ttggcccttt ttacactgtg actgattgag ctggtgccgt gtcgagtggt gttttttaat 420
aggttttttt actggtaagg ctgactgtta tggctgccgc tgtggaagcg ctgtatgttg 480
ttctggagcg ggagggtgct attttgccta ggcaggaggg tttttcaggt gtttatgtgt 540
ttttctctcc tattaatttt gttatacctc ctatgggggc tgtaatgttg tctctacgcc 600
tgcgggtatg tattcccccg ggctatttcg gtcgcttttt agcactgacc gatgttaacc 660
aacctgatgt gtttaccgag tcttacatta tgactccgga catgaccgag gaactgtcgg 720
tggtgctttt taatcacggt gaccagtttt tttacggtca cgccggcatg gccgtagtcc 780
gtcttatgct tataagggtt gtttttcctg ttgtaagaca ggcttctaat gtttaaatgt 840
ttttttttgt tattttattt tgtgtttaat gcaggaaccc gcagacatgt ttgagagaaa 900
aatggtgtct ttttctgtgg tggttccgga acttacctgc ctttatctgc atgagcatga 960
ctacgatgtg cttgcttttt tgcgcgaggc tttgcctgat tttttgagca gcaccttgca 1020
ttttatatcg ccgcccatgc aacaagctta cataggggct acgctggtta gcatagctcc 1080
gagtatgcgt gtcataatca gtgtgggttc ttttgtcatg gttcctggcg gggaagtggc 1140
cgcgctggtc cgtgcagacc tgcacgatta tgttcagctg gccctgcgaa gggacctacg 1200
ggatcgcggt atttttgtta atgttccgct tttgaatctt atacaggtct gtgaggaacc 1260
tgaatttttg caatcatgat tcgctgcttg aggctgaagg tggagggcgc tctggagcag 1320
atttttacaa tggccggact taatattcgg gatttgctta gagacatatt gataaggtgg 1380
cgagatgaaa attatttggg catggttgaa ggtgctggaa tgtttataga ggagattcac 1440
cctgaagggt ttagccttta cgtccacttg gacgtgaggg cagtttgcct tttggaagcc 1500
attgtgcaac atcttacaaa tgccattatc tgttctttgg ctgtagagtt tgaccacgcc 1560
accggagggg agcgcgttca cttaatagat cttcattttg aggttttgga taatcttttg 1620
gaataaaaaa aaaaaaaaca tggttcttcc agctcttccc gctcctcccg tgtgtgactc 1680
gcagaacgaa tgtgtaggtt ggctgggtgt ggcttattct gcggtggtgg atgttatcag 1740
ggcagcggcg catgaaggag tttacataga acccgaagcc agggggcgcc tggatgcttt 1800
gagagagtgg atatactaca actactacac agagcgagct aagcgacgag accggagacg 1860
cagatctgtt tgtcacgccc gcacctggtt ttgcttcagg aaatatgact acgtccggcg 1920
ttccatttgg catgacacta cgaccaacac gatctcggtt gtctcggcgc actccgtaca 1980
gtagggatcg cctacctcct tttgagacag agacccgcgc taccatactg gaggatcatc 2040
cgctgctgcc cgaatgtaac actttgacaa tgcacaacgt gagttacgtg cgaggtcttc 2100
cctgcagtgt gggatttacg ctgattcagg aatgggttgt tccctgggat atggttctga 2160
cgcgggagga gcttgtaatc ctgaggaagt gtatgcacgt gtgcctgtgt tgtgccaaca 2220
ttgatatcat gacgagcatg atgatccatg gttacgagtc ctgggctctc cactgtcatt 2280
gttccagtcc cggttccctg cagtgcatag ccggcgggca ggttttggcc agctggttta 2340
ggatggtggt ggatggcgcc atgtttaatc agaggtttat atggtaccgg gaggtggtga 2400
attacaacat gccaaaagag gtaatgttta tgtccagcgt gtttatgagg ggtcgccact 2460
taatctacct gcgcttgtgg tatgatggcc acgtgggttc tgtggtcccc gccatgagct 2520
ttggatacag cgccttgcac tgtgggattt tgaacaatat tgtggtgctg tgctgcagtt 2580
actgtgctga tttaagtgag atcagggtgc gctgctgtgc ccggaggaca aggcgtctca 2640
tgctgcgggc ggtgcgaatc atcgctgagg agaccactgc catgttgtat tcctgcagga 2700
cggagcggcg gcggcagcag tttattcgcg cgctgctgca gcaccaccgc cctatcctga 2760
tgcacgatta tgactctacc cccatgtagg cgtggacttc cccttcgccg cccgttgagc 2820
aaccgcaagt tggacagcag cctgtggctc agcagctgga cagcgacatg aacttaagcg 2880
agctgcccgg ggagtttatt aatatcactg atgagcgttt ggctcgacag gaaaccgtgt 2940
ggaatataac acctaagaat atgtctgtta cccatgatat gatgcttttt aaggccagcc 3000
ggggagaaag gactgtgtac tctgtgtgtt gggagggagg tggcaggttg aatactaggg 3060
ttctgtgagt ttgattaagg tacggtgatc aatataagct atgtggtggt ggggctatac 3120
tactgaatga aaaatgactt gaaattttct gcaattgaaa aataaacacg ttgaaacata 3180
acatgcaaca ggttcacgat tctttattcc tgggcaatgt aggagaaggt gtaagagttg 3240
gtagcaaaag tttcagtggt gtattttcca ctttcccagg accatgtaaa agacatagag 3300
taagtgctta cctcgctagt ttctgtggat tcactagtgc cattaagtgt aatggtaagt 3360
atcataggtt tagttttatc accatgcaag taaacttgac tgacaatgtt aTttttragca 3420
gtttgacttt gggtttttgg ataggctaga aggttaggca taaatccaac tgcatttgtg 3480
tatggatttg cattagttga gttcccattt ctaaagttcc agtaatgttt tttaagtgag 3540
gagttctcca ttagaacacc gttttggtca aatctaagga atatactaac acttgcaacg 3600
gtgcctgtca tggatgaaag atctccagat acagccaaag cagctacagt agctagtact 3660
tgactcccac attttgtaag aaccaaagta aatttgcagt cattatctga atgaattctg 3720
cagttaggag atgggtctgg ggttgtccac agggtaagtt tgtcatcatt tttgtttcct 3780
attgtaatgg cccctgagtt gtcaaagctt aaacccgctc caagtttagt aatcatggca 3840
ccgttttcat tgtaatcaat gccagagcca attttagttt ttattgggtt gatatctgga 3900
gactcagatg tgtttgtatc aaactccaga ccctttcctg catttatagc tatggcagta 3960
ttatcaaagt ttagtccact ggattttttt atgctaactt ccagtttttt agtattgttt 4020
gatgcattaa aaaggtatag gcctctgtta tagtttatgt ccaagttatg agatgcatta 4080
atatacaggg gtccctgccc cagtttaaga cgtagttttg tttgagcatc aaatgggtaa 4140
tccacatcta gaattaacaa gttgttattt atacgcatgc caccgcccgt tttaatttcc 4200
atgttgtttg atgaatcata accaatagct cctgcaactt tggttctaag ggagttttgt 4260
tcaacggtga cacctggtcc agtaactact gttagtgtat cggagttttg tgctacttgc 4320
aaaggaccgc ttattttaat tcctattttt ccattattta cataaatagg atcttccatg 4380
ttaatgccca agctacccgt ggcagtagtt agcgggggtg atgcagttac agtaagggtg 4440
tcgctgtcac tgccagagag gggggctgat gtttgcaggg ctagctttcc atctgacact 4500
gtaatgggcc ctttagtagc aatgcttagt ttggagtctt gcacggtcag tggggcttgt 4560
gactgtacgc taagagcgcc gctagtaact atcagaggag cggtggttgc cactgttagg 4620
gcgcctgagg taattgtaag tggtgcggag gtgtccaaac ttatgtttga ctttgttttt 4680
ttaagtggct gagtaacagt ggttacattt tgggaggtga ggtttccggc cttgtctagg 4740
gtaagaccgc tgcccatttt aagcgcaagc atgccgtggg aggtgtccaa aggttcggag 4800
acgcgtagag agagaactcc agggggactt tcttggaaac cattgggtga aacaaatgga 4860
ggggtaagaa agggcacagt tggaggcccg gtttctgtgt catatggata cacggggttg 4920
aaggtgtctt cagacggtct ggcgcgtttc atctgcaaca atatgaagat agtgggtgcg 4980
gagggacaag aacatgagga atttgacatc ccatttaaac tttggagaaa gtttgcagct 5040
aaaaggcggc tgagatacca gagttgggag gaaggaaagg aggtgatgct gaataagctg 5100
gacaaagatt tgctgactga ttttaagtaa gtaatttatt gtgtgtttat gttagttgaa 5160
tggaataaga tctctaatac cacacatggt tttaataaga gtgcagaggt cctctggacc 5220
ctgatagggg aagtgcaggc agccctctgt ttctgccgag tgctgggtga cggtgatagg 5280
tttttctccc accataagca ccagtttttg gcgctgggtg ggtagcttgt agctgaggcg 5340
gttgccggta gtggtttttt cgtaggtaag tttggcctgc ttgaccacac aaaagatacc 5400
tcttttacac tggtgtaggt taaccatgtc ttcaacttct tgttttaggc gttctcgctc 5460
ggacgccgcc ttgcgccttt ctagtaggcg ctgttcggtg ttaattccat ccaattctag 5520
atctagagat tcagtcatct ccacctgtca aattaaagta gctaatctca gtgggggtgg 5580
gagaaggggg gcgaggctga ttgattgggg caataacctg ttgcagtggt atgacagcgg 5640
gcactgggaa agtagggtgg ttcatggcat ctatggcatt ccagccaatg tcaaggtatg 5700






gatatatggc tagggcaaaa atggtactgc aaaaaaccat gacagagatg atggcgtata 5760
accaggcttc tgacaaatcg ctctgtttgt tgtagcagct gggaatgttc catatttgag 5820
tgaatctgca ggaaatatgt cttttgggag gcgctgaggt ttgggagcaa agcacaggta 5880
gggcgcaaaa aatcagcaaa acaaaaatga cactccgttt cataattaaa gaattctgag 5940
aagatcagct atagtcctgt ctctgtattg cggatggtgc ctgaggtacg caatgcgcac 6000
acaaacccag tcaatgaact gaatgaaggc gatgactaca gtgacgaggc tgcagatgag 6060
gataagggtg acaaatccgt aaagcaggta aactgtgaaa ggtgggatgc aatctacttc 6120
gatgtgagcg accgcggcca atgtagagca cgcacagaaa agcgcaacaa gggtcaataa 6180
tataagaact cgaggaatca tgtctcattt aatcatactg taaaagaaga gaacatggtt 6240
tcagaccgtc caatctatga attttttcat tgtgtgggtt gagcacaatg ataggcctat 6300
agatgggggg tctggcgcgt ctgcgcttta ggcaacaaat aagccacata ataataaggc 6360
aaacaaacat aagcgctatg gaaaaccacc acatgtccaa gctcgcccag tcattgacaa 6420
aggcatgaac ttggggtaaa tttagggcag atgttagtcc ggtagcagtg gtgttgcgat 6480
agtccgttgt gggcgcgatg gttgagccgg tcatctctgg agcaggcaag ctgaagctgg 6540
gtttgatcaa atttgcagtg caggcgctgg cagaaatcag gcgctaacgt ccaggaaagt 6600
ttgatttgaa ggttgtgggt ataatcttgc ccgcctggag catatcccac atagagtaaa 6660
ttgtccaggg gaatacaagc aagcggaaaa tcaaggcatt ttcttttcat caataaaact 6720
gcgtctgctt ttgtatttga gataaagtaa ggtacatacc aaagcaagcg ctgtaataag 6780
cagagcggtg gaacaaaagg tgccagtgtt ctctaaacac ttttgtgggg gccacaactt 6840
gtactgtttg ctcatgcaca cggtaatatc gcacatttca taaaatggaa atttatacat 6900
aaaagtttta cgattttcac cttggaagac tgtgacatta tagtcgttag tgtcacctgg 6960
ctgccaaata gcatatacag catacttgcc aattttgtct ttgtggcgaa taataagctt 7020
ttcatgttct gtggtgcatt ttataagagt agtgcattca ttagcttctg atttaaatgt 7080
aacattgcaa gctggttcct taaactcaac ctttttggca gcgctgcaga ctgccgcaag 7140
ggcgagcaag cctaaaatca tgtacctcat cttggatgtt gcccccagcg tttaaaaagc 7200
tgacaatagg tacaaacgtg cgtgcagcag gcggcaaccc taaggcacag aagtgctagt 7260
ataagaataa acagaattac aagagtaagg ataaccccga ccccaattcc agaaaaatta 7320
gacaagcttg tagagttact tgaattgctc atatacttaa ttaaaaaatc ccagcacccc 7380
gcaaaatgct tttttgacct gagttccggg agttgagctc acctcctgtt ttggaaaaat 7440
gggagtaatg tctggttacg ctcaggctgt aggtgtgggc gcagcaaccg gtgacgcact 7500
cgtacgttcc cggcaggtga ggagggtggt ggtggtggtg tttttcttga cggtgtagtt 7560
gaagccgaga aggttgtgtg gcaaacttac ttcgtctcgc tggaaactgt tgtaaattac 7620
aaatgaagag ccgttaaagt accaggtaag gtacttattg gcccgcttgt gcaaaccgga 7680
ggtgaggttt gctttggtct gctttgggtg ggtaaaaacg gtggcgttca caggatggcg 7740
acaggagccc cagtagattc taatttctgt atttattata ctcagcacag agatgacaac 7800
aaagatcttg atgtaatcca gggttaggac agttgcaaac cacggtcaga acacagggac 7860
cccgctcccg ctccactagc agggggcgct tggtaaactc ccgaatcagg ctacgtgtaa 7920
gctctacctg ggtggtgagc cggacgccgt gcgccgggcc ctcgatatgc tcttcgggca 7980
attcaaagta acaaaactca ccggagccgc gggcaaagca cttgtggcgg cggcagtggt 8040
cgaggtgtgt caggcgcagt cgctctgcct ctccactggt cattcagtcg tagccgtccg 8100
ccgagtcttt caccgcgtca aagttgggaa taaactggtc cgggtagtgg ccgggaggtc 8160
cagaaaaggg gttgaagtaa accgaaggca cgaactcctc aataaattgt agagttccaa 8220
tgcctccgga gcgcggctcc gaggacgagg tctgcagagt taggatcgcc tgacggggcg 8280
taaatgaaga gcggccagcg ccgccgatct gaaatgtccc gtccggacgg agaccaagag 8340
aggagctcac cgactcgtcg ttgagctgaa tacctcgccc tctgattttc aggtgagtta 8400
taccctgccc gggcgaccgc accctgtgac gaaagccgcc cgcaagctgc gcccctgagt 8460
tagtcatctg aacttcggcc tgggcgtctc tgggaagtac cacagtggtg ggagcgggac 8520
tttcctggta caccagggca gcgggccaac tacggggatt aaggttatta cgaggtgtgg 8580
tggtaatagc cgcctgttcg aggagaattc ggtttcggtg ggcgcggatt ccgttgaccc 8640
gggatatcat gtggggtccc gcgctcatgt agtttattcg ggttgagtag tcttgggcag 8700
ctccagccgc aagtcccatt tgtggctggt aactccacat gtagggcgtg ggaatttcct 8760
tgctcataat ggcgctgacg acaggtgctg gcgccgggtg tggccgctgg agatgacgta 8820
gttttcgcgc ttaaatttga gaaagggcgc gaaactagtc cttaagagtc agcgcgcagt 8880
atttgctgaa gagagcctcc gcgtcttcca gcgtgcgccg aagctgatct tcgcttttgt 8940
gatacaggca gctgcgggtg agggagcgca gagacctgtt ttttattttc agctcttgtt 9000
cttggcccct gctttgttga aatatagcat acagagtggg aaaaatccta tttctaagct 9060
cgcgggtcga tacgggttcg ttgggcgcca gacgcagcgc tcctcctcct gctgctgccg 9120
ccgctgtgga tttcttgggc tttgtcagag tcttgctatc cggtcgcctt tgcttctgtg 9180
tgaccgctgc tgttgctgcc gctgccgctg ccgccggtgc agtaggggct gtagagatga 9240
cggtagtaat gcaggatgtt acgggggaag gccacgccgt gatggtagag aagaaagcgg 9300
cgggcgaagg agatgttgcc cccacagtct tgcaagcaag caactatggc gttcttgtgc 9360
ccgcgccacg agcggtagcc ttggcgctgt tgttgctctt gggctaacgg cggcggctgc 9420
ttagacttac cggccctggt tccagtggtg tcccatctac ggttgggtcg gcgaacaggc 9480
agtgccggcg gcgcctgagg agcggaggtt gtagcgatgc tgggaacggt tgccaatttc 9540
tggggcgccg gcgaggggaa tgcgaccgag ggtgacggtg tttcgtctga cacctcttcg 9600
gcctcggaag cttcgtctag gctgtcccag tcttccatca tctcctcctc ctcgtccaaa 9660
acctcctctg cctgactgtc ccagtattcc tcctcgtccg tgggtggcgg cggcggcagc 9720
tgcagcttct ttttgggtgc catcctggga agcaagggcc cgcggctgct gatagggctg 9780
cggcggcggg gggattgggt tgagctcctc gccggactgg gggtccaggt aaaccccccg 9840
tccctttcgt agcagaaact cttggcgggc tttgttgatg gcttgcaatt ggccaaggat 9900
gtggccctgg gtaatgacgc aggcggtaag ctccgcattt ggcgggcggg attggtcttc 9960
gtagaaccta atctcgtggg cgtggtagtc ctcaggtaca aatttgcgaa ggtaagccga 10020
cgtccacagc cccggagtga gtttcaaccc cggagccgcg gacttttcgt caggcgaggg 10080
accctgcagc tcaaaggtac cgataatttg actttcgcta agcagttgcg aattgcagac 10140
cagggagcgg tgcggggtgc ataggttgca gcgacagtga cactccagta ggccgtcacc 10200
gctcacgtct tccatgatgt cggagtggta ggcaaggtag ttggctagct gcagaaggta 10260
gcagtgaccc caaagcggcg gagggcattc acggtactta atgggcacaa agtcgctagg 10320
aagcgcacag caggtggcgg gcagaattcc tgaacgctct aggataaagt t^^tTaaagtt 10380
ttgcaacatg ctttgactgg tgaagtctgg cagaccctgt tgcagggttt taagcaggcg 10440
ttcggggaag ataatgtccg ccaggtgcgc ggccacggag cgctcgttga aggccgtcca 10500
taggtccttc aagttttgct ttagcagctt ctgcagctcc tttaggttgc gctcctccag 10560
gcattgctgc cacacgccca tggccgtttg ccaggtgtag cacagaaata agtaaacgca 10620
gtcgcggacg tagtcgcggc gcgcctcgcc cttgagcgtg gaatgaagca cgttttgccc 10680
gaggcggttt tcgtgcaaaa ttccaaggta ggagaccagg ttgcagagct ccacgttgga 10740
aattttgcag gcctggcgca cgtagccctg gcgaaaggtg tagtgcaacg tttcctctag 10800
cttgcgctgc atctccgggt cagcaaagaa ccgctgcatg cactcaagct ccacggtaac 10860
aagcactgcg gccatcatta gcttgcgtcg ctcctccaag tcggcaggct cgcgcgtctc 10920
aagccagcgc gccagctgct catcgccaac tgcgggtagg ccctcctcgg tttgttcttg 10980
caagtttgca tccctctcca ggggtcgtgc acggcgcacg atcagctcgc tcatgactgt 11040
gctcataacc ttggggggta ggttaagtgc cgggtaggca aagtgggtga cctcgatgct 11100
gcgtttcagc acggctaggc gcgcgttgtc accctcaagt tccaccagca ctccacagtg 11160
actttcattt tcgctgtttt cttgttgcag agcgtttgcc gcgcgtttct cgtcgcgtcc 11220
aagaccctca aagatttttg gcacttcgtc gagcgaggcg atatcaggta tgacagcgcc 11280
ctgccgcaag gccagctgct tgtccgctcg gctgcggttg gcacggcagg ataggggtat 11340
cttgcagttt tggaaaaaga tgtgataggt ggcaagcacc tctggcacgg caaatacggg 11400
gtagaagttg aggcgcgggt tgggctcgca tgtgccgttt tcttggcgtt tggggggtac 11460
gcgcggtgag aacaggtggc gttcgtaggc aaggctgaca tccgctatgg cgaggggcac 11520
atcgctgcgc tcttgcaacg cgtcgcagat aatggcgcac tggcgctgca gatgcttcaa 11580
cagcacgtcg tctcccacat ctaggtagtc gccatgcctt tggtcccccc gcccgacttg 11640
ttcctcgttt gcctctgcgt cgtcctggtc ttgcttttta tcctctgttg gtactgagcg 11700
atcctcgtcg tcttcgctta caaaacctgg gtcctgctcg ataatcactt cctcctcctc 11760
aagcgggggt gcctcgacgg ggaaggtggt aggcgcgttg gcggcatcgg tggaggcggt 11820
ggtggcgaac tcaaaggggg cggttaggct gtcctccttc tcgactgact ccatgatctt 11880
tttctgccta taggagaagg aaatggccag tcgggaagag gagcagcgcg aaaccacccc 11940
cgagcgcgga cgcggtgcgg cgcgacgtcc accaaccatg gaggacgtgt cgtccccgtc 12000
gccgtcgccg ccgcctcccc gcgcgccccc aaaaaagcgg ctgaggcggc gtctcgagtc 12060
cgaggacgaa gaagactcgt cacaagatgc gctggtgccg cgcacaccca gcccgcggcc 12120
atcgacctcg acggcggatt tggccattgc gtccaaaaag aaaaagaagc gcccctctcc 12180
caagcccgag cgcccgccat ccccagaggt gatcgtggac agcgaggaag aaagagaaga 12240
tgtggcgcta caaatggtgg gtttcagcaa cccaccggtg ctaatcaagc acggcaaggg 12300
aggtaagcgc acggtgcggc ggctgaatga agacgaccca gtggcgcggg gtatgcggac 12360
gcaagaggaa aaggaagagt ccagtgaagc ggaaagtgaa agcacggtga taaacccgct 12420
gagcctgccg atcgtgtctg cgtgggagaa gggcatggag gctgcgcgcg cgttgatgga 12480
caagtaccac gtggataacg atctaaaggc aaacttcaag ctactgcctg accaagtgga 12540
agctctggcg gccgtatgca agacctggct aaacgaggag caccgcgggt tgcagctgac 12600
cttcaccagc aacaagacct ttgtgacgat gatggggcga ttcctgcagg cgtacctgca 12660
gtcgtttgca gaggtaacct acaagcacca cgagcccacg ggctgcgcgt tgtggctgca 12720
ccgctgcgct gagatcgaag gcgagcttaa gtgtctacac gggagcatta tgataaataa 12780
ggagcacgtg attgaaatgg atgtgacgag cgaaaacggg cagcgcgcgc tgaaggagca 12840
gtctagcaag gccaagatcg tgaagaaccg gtggggccga aatgtggtgc agatctccaa 12900
caccgacgca aggtgctgcg tgcatgacgc ggcctgtccg gccaatcagt tttccggcaa 12960
gtcttgcggc atgttcttct ctgaaggcgc aaaggctcag gtggctttta agcagatcaa 13020
ggctttcatg caggcgctgt atcctaacgc ccagaccggg cacggtcacc ttctgatgcc 13080
actacggtgc gagtgcaact caaagcctgg gcatgcaccc tttttgggaa ggcagctacc 13140
aaagttgact ccgttcgccc tgagcaacgc ggaggacctg gacgcggatc tgatctccga 13200
caagagcgtg ctggccagcg tgcaccaccc ggcgctgata gtgttccagt gctgcaaccc 13260
tgtgtatcgc aactcgcgcg cgcagggcgg aggccccaac tgcgacttca agatatcggc 13320
gcccgacctg ctaaacgcgt tggtgatggt gcgcagcctg tggagtgaaa acttcaccga 13380
gctgccgcgg atggttgtgc ctgagtttaa gtggagcact aaacaccagt atcgcaacgt 13440
gtccctgcca gtggcgcata gcgatgcgcg gcagaacccc tttgattttt aaacggcgca 13500
gacggcaagg gtggggggta aataatcacc cgagagtgta caaataaaaa catttgcctt 13560
tattgaaagt gtctcctagt acattatttt tacatgtttt tcaagtgaca aaaagaagtg 13620
gcgctcctaa tctgcgcact gtggctgcgg aagtagggcg agtggcgctc caggaagctg 13680
tagagctgtt cctggttgcg acgcagggtg ggctgtacct ggggactgtt aagcatggag 13740
ttgggtaccc cggtaataag gttcatggtg gggttgtgat ccatgggagt ttggggccag 13800
ttggcaaagg ^gtggagaaa catgcagcag aatagtccac aggcggccga gttgggcccc 13860
tgcacgcttt gggtggactt ttccagcgtt atacagcggt cgggggaaga agcaatggcg 13920
ctacggcgca ggagtgactc gtactcaaac tggtaaacct gcttgagtcg ttggtcagaa 13980
aagccaaagg gctcaaagag gtagcatgtt tttgagcgcg ggttccaggc aaaggccatc 14040
cagtgtacgc ccccagtctc gcgaccggcc gtattgacta tggcgcaggc gagcttgtgt 14100
ggagaaacaa agcctggaaa gcgcttgtca taggtgccca aaaaatatgg cccacaacca 14160
agatctttga caatggcttt cagttcctgc tcactggagc ccatggcggc agctgttgtt 14220
gatgttgctt gcttctttta tgttgtggcg ttgccggccg agaagggcgt gcgcaggtac 14280
acggtctcga tgacgccgcg gtgcggctgg tgcacacgga ccacgtcaaa gacttcaaac 14340
aaaacataaa gaagggtggg ctcgtccatg ggatccacct caaaagtcat gtctagcgcg 14400
tgggcggagt tggcgtagag aaggttttgg cccaggtctg tgagtgcgcc catggacata 14460
aagttactgg agaatgggat gcgccaaagg gtgcgatcgc aaagaaactt tttctgggta 14520
atactgtcaa ccgcggtttt gcctattagt gggtagggca cgttggcggg gtaagcctgt 14580
ccctcgcgca tggtgggagc gaggtagcct acgaatcctg agttgttatg ctggtgaaga 14640
attccaacct gctgatactc cttgtattta gtatcgtcaa ccacttgccg gctcatgggc 14700
tggaagtttc tgaagaacga gtacatgcgg tccttgtagc tttctggaat gtagaagccc 14760
tggtagccaa tattgtagtt ggccaacatc tgcaccagga accagtcctt ggtcatgttg 14820
cactgagcta cgttgtagcc ctccccgtca actgagcgtt taatctcaaa ctcattggga 14880
gtaagcaggc ggtcgttgcc cggccagcta acagaagagt caaaggtaat ggccaccttc 14940
ttaaaggtgt gattaagata gaaggttccg tcaaggtatg gtatggagcc agagtaggtg 15000
tagtaagggt cgtagcctga tcccagggaa ggggtttcct ttgtcttcaa gcgtgtgaag 15060
gcccaaccgc gaaatgctgc ccagttgcgc gatgggatgg agatgggcac gttggtggcg 15120
ttggcgggta tggggtatag catgttggcg gcggaaaggt agtcattaaa ggactggtcg 15180
ttggtgtcat ttctgagcat ggcttccagc gtggaggccg tgttgtgggc catggggaag 15240
aaggtggcgt aaagacaaat gctgtcaaac ttaatgctag ccccgtcaac tctaagatcg 15300
tttcccagag agctctgcag aaccatgtta acatccttcc tgaagttcca ttcatatgta 15360
tatgagcctg gcaggaggag gaggttttta atggcaaaaa acttttgggg cacctgaatg 15420
tgaaagggca cgtagcggcc gtttcccaac aacatggagc gataacggag gcccgcattg 15480
cggtggtggt taaagggatt aacgttgtcc atgtagtcca gagaccagcg cgccccaagg 15540
ttaatgtagc agtctacaag cccgggagcc accactcgct tgttcatgta gtcgtaggtg 15600
ttggggttgt cagatatttc cacattggtg gggttgtatt ttagcttgtc tggcaggtac 15660
agcgcaatat tggagtaaag gaaatttctc cataggttgg catttaggtt aatttccatg 15720
gcaaagttgt tacccactcc tatttcatta cgtgttgcaa aagbttcatc ttttgtccat 15780
gtagtatctc cattatcgcc tgagccattg ccattagcct taatagcttg ataggtgtca 15840
gttaccccaa tacccccaag aggaaaacaa taatttggca attcatcctc agttccatgg 15900
ttttcaatga ttctaacatc tggatcatag ctgtctacag cctgattcca catagaaaaa 15960
tatctggttc tatcacctat ggaatcaagc aagagttgat aggacagctc tgtgtttctg 16020
tcttgcaaat ctaccacggc atttagctgc gatgcctgac cagcaagaac acccatgttg 16080
ccagtgctgt tataatacat taggccaata aaattgtccc tgaaagcaat gtaattgggt 16140
ctgtttggca tagattgttg acccaacata gctttagaat tttcatcacc ttttccaggt 16200
ttgtaagaca gatgtgtgtc tggggtttcc atatttacat cttcactgta caaaaccact 16260
tttggtttag tagcattgcc ttgccggtcg ttcaaagagg tagtatttga gaagaattgc 16320
aagtcaacct ttggaagagg cacccctttt tcatccggaa ccagaacgga ttgaccacca 16380
aaaggatttg taggcctggc ataagatcca tagcatggtt tcatgggagt tgttttttta 16440
agcactctcc ctcctgccgc attagcatca gcttcgttcc actgagattc gccaatttga 16500
ggttctggtt gataggaagg atctgcgtat acaggtttag cttgtgtttc tgcattgtct 16560
gatcctattt gtagcccgct ttttgtaatt gtttctccag acaaaggagc ctgggcatag 16620
acatgtgttt tcttagtagc ctgatctcga gcgttttgct cttcttcttc ctcttcttca 16680
tcttcatctt cctcttcttc atcctcggca actgcccggc cgctatcttc ggtttgttcc 16740
cactcacagg agttaggagc gcccttggga gctagagcgt tgtaggcagt gccggagtag 16800
ggcttaaaag taggccccct gtccagcacg ccgcggatgt caaagtacgt ggaagccata 16860
tcaagcacac ggttgtcacc cacagccagg gtgaaccgcg ctttgtacga gtacgcggta 16920
tcctcgcggt ccacagggat gaaccgcagc gtcaaacgct gggaccggtc tgtggttacg 16980
tcgtgcgtag gtgccaccgt ggggtttcta aacttgttat tcaggctgaa gtacgtctcg 17040
gtggcgcggg caaactgcac cagcccgggg ctcaggtact ccgaggcgtc ctggcccgag 17100
atgtgcatgt aagaccactg cggcatcatc gaaggggtag ccatcttgga aagcgggcgc 17160
acggcggctc agcagctcct ctggcggcga catggacgca tacatgacac atacgacacg 17220
ttagctattt agaagcatcg tcggcgcttc agggattgca cccccagacc cacgatgctg 17280
ttcagtgtgc ttrtgccagtt gccactggct acgggccgca bcgatcgcgg acCyCtyy Cy 17340
gcacggcgca gggacgcgcg gctagggcgg gttacaacaa cggcggacgg ccctggcagc 17400
acaggtttct gctgggtgtc agcgggggga ggcaggtcca gcgttacagg tgtgtgctgg 17460
cccagcactc cggtagccat gggcgcgatg ggacgggtgg tgggcaggcc ttgctttagt 17520
gcctcctcgt acgagggagg ctcatctatt tgcgtcacca gagtttcttc cctgtcgggc 17580
cgcggacgct tttcgccacg cccctctgga gacactgtct ccacggccgg tggaggctcc 17640
tctacgggag ggcggggatc aagcttactg ttaatcttat tttgcactgc ctggttggcc 17700
aggtccacca ccccgctaat gccagaggcc aggccatcta ccaccttttg ttggaaattt 17760
tgctctttca acttgtccct cagcatctgg cctgtgctgc tgttccaggc cttgctgcca 17820
tagttcttaa tggtggaacc gaaattttta atgccgctcc acagcgagcc ccagctgaag 17880
gcgccaccgc tcatattgct ggtgccgata tcttgccagt ttcccatgaa cgggcgcgag 17940
ccgtgtcgcg gggccagaga cgcaaagttg atgtcttcca ttctacaaaa tagttacagg 18000
accaagcgag cgtgagactc cagacttttt attttgattt ttccacatgc aacttgtttt 18060
taatcagtgt ctctgcgcct gcaaggccac ggatgcaatt ccgggcacgg cgccaatcgc 18120
cgcggcgatc agtggaataa ggaggggcag gataccgccg cgcatgcgac ggtgcgacgc 18180
gcgccgccgc cggtggtgcg cacgacgcat gccgcccgtc aggccgtggc cggccatgcc 18240
cctcctacgg tgcattcttc ctcggaatcc cggcaccggg aaacggaggc ggcaggtgag 18300
ggccatatct gcaagaacca caaagaccgg cttttaaacg atgctggggt ggtagcgcgc 18360
tgttggcagc accagggtcc tgcctccttc gcgagccacc ctgcgcacgg aaatcggggc 18420
cagcacgggc tggcgacggc gacggcggcg gcgggttcca gtggtggttc ggcgtcgggt 18480
agtcgctcgt cttctggggc ggtaggtgta gccacgatag ccgggggtag gcgcgatgga 18540
aggatgtagg gcatattcgg gcagtagtgc gctggcggtg ccgtacttcc tggaacggcg 18600
cgggcgccgg ggggctgaaa cgcgaaacat ccacgggtcc gtttgcacct ccgtagaggt 18660
tttggacgcg gccgcagcgg ccgcctgcac cgcggcatct gccaccgccg aggcaaccgg 18720
ggacgtttgt gtctccatgc cctctgtggc agtggcaata ctagtgctac tggtggtggg 18780
tatctgaacg tccacggtct gcacgcccag tcccggtgcc acctgcttga ttggccgcac 18840
gcggacctcg ggctccagcc caggctccac ggtcattttt tccaagacat cttccagtcg 18900
ctggcgcttg ggtaccatca gctgcacggt gggtgccaag tcaccagact cgcgctttag 18960
gccgcgcttt tcttcggacg gtgcaagcgt gggcagcacc tgctgcagtg tcacgggctt 19020
taggctaggt gttgggttgc cctcgtccag cggcaacgcc aacatgtcct tatgccgctt 19080
tccgtaggca aactccccga ggcgctcgtt ggcctgctca agcaggtcct cgtcgccgta 19140
cacctcatca tacacgcgct tgtaggtgcg ggtggagcgc tcaccgggcg taaaaactac 19200
ggtggtgccg ggtcgcaaaa cacgtcttac gcgtcgacct ttccactgta cccgccgcct 19260
gggcgcggtt gcgtgcagca gttccacctc gtcgtcaagt tcatcatcat catcatcttt 19320
ctttttcttt ttgacccgct ttagctttcg gggcttgtaa tcctgctctt ccttcttcgg 19380
ggggccatag atctccggcg cgatgacctg gagcatctct tctttgattt tgcgcttgga 19440
catagcttcg ttgcgcgccg ccgccgctgg atacatacaa cagtacgagt ctaagtagtt 19500
ttttcttgca atctagttgc gcggggggcg ggtgcgcacg ggcacgcgca ggccgctaac 19560
cgagtcgcgc acccagtaca cgttgcccct gcgaccctga gtcatagcac taatggccgc 19620
ggctgctgcg gcggccgctc gtcgcctgga
cagccttcga gcggcccgca tggccgcccg
ggccgccgcc gcgcgttggg cggcagtgcc 
cctccgccgt ctcttcattt tagcataacg 
cgcgtccact gtggacactg gtggcggcgt 
cgcgtcaatg gcgtcatcga cggtggtgcg 
gggcgcgcgg tagtgcccgc gcacgcgcac 
gccaaacatc ttgcttggga agcgcaggcc 
gatggacatg tttgctcaaa aagtgcggct 
ggccttgtaa acgtaggggc aggtgcggcg 
tcctccgatg ctgttgcgca gcggtagcgt 
actgacggtg gtgatggtgg gggctggcgg 
attgaacacg tgggtcagag aggtaaactg 
gttgtagaag ctcttggagt gcacgggcaa 
gatctggctc gtggagcgga aggtcacggg 
gacctgctcc gagccgcagg ttacgtcagg 
ggtctgaggg tcgccgtagt tgtatgcaag 
gtcattgctt attaggttgt aactgcgttt 
cggtttcttc tgaggcttct cgacctcggg 
tgcctcggcc ~teagcgc*yct ^tctcctccgc 
atgatcgttc atgtcctcca ccggctgcat 
cgcgccgctg ccactgttgt tgccgccgcc 
ttttaagctt gcctggtagg cgtccacatc 
gtcatcgtag gtgatcctaa agccctcctg 
gttgctcagg cggctgtggg tgaagtccac 
atggaaggct tcgtttgtat ataccccagg 
cagtctgaag ttgcgggtgt caaactttac 
cctgcccact ttcaagtagt gctccacgat 
ctcggagtag ttgccctcgg gcagcgtgaa 
tttgtcctta gtaagcgagc gcgacaccat 
gaactcgttc acatttggca tgttggtatg 
cgaacggtcg tcaagattga tggtctgtgt 
ttgaatgacc gtggttagaa agttgctgtg 
cgttgacttg ttgtccacaa ggtacacacg 
gtaacggatg ctgtttctcc ccccggtagg 
gtccagggga gcatcgaagg gggaacccag 
gctctcgtag gagggaggag gaccttcctc 
aatacaagaa aaccaacgct cggtgccatg 
cttttttttt ttttttttaa aacattctcc 
tgccactccc tcccaaatcc aggacgctgc 
ccagaccccg ctgacggtcg tgcctttgac 
ccctgtgctc ctgcgcatac gtcttccatc 
cgttgttggg aaatgccgga ggcaggttct 
ttaggtactc ctcctcgccc agcaggcgcg 
ctatcaagct tggaaatggg ctactcgcat 
acaagctgct tggcctgcgg aagctttcct 
gttgcaactc tagcagggtc tgcggttgcg 
agaggaatcc atcgttaccc tcgggcacct 
gtagccagtg cgggttcaag atggcattgg 
gatgcaagta gtccattagg cgattgataa 
ccatgttgcg cgcggtcatg tccagcgcca 
taaggctcac gctctgctgc acatagcgca 
gcaacgaggg gatcttctgc cgccggttgg 
tgcccgtgtc ctcctgcccc agcgcgcggc 
cgtccacatg cgcctgacct atggcctcgc 
tgtcccggga cacgctgcca ctgtccgtga 
agttgggcgt cagcaagcta gacacggtcg 
acagcccctg caagttcttg aaagcctggc 



cctggggggc acagtgacaa tacccgcggc 19680 
tcggccggtg cgacgtgcgc ggttaagcag 19740 
gggtcggcgg cggtggcgac gtgctacgcg 19800 
ccgggctccg cgcaccacgg tctgaatggc 19860 
gggcgtgtag ttgcgcgcct cctccaccac 19920 
cccagtgcgg ccgcgtttgt gcgcgcccca 19980 
tgggtgttgg tcggagcgct tctttgcccc 20040 
ccagcctgtg ttattgctgg gcgatataag 20100 
cgataggacg cgcggcgaga ctatgcccag 20160 
tctggcgtca gtaatggtca ctcgctggac 20220 
cccgtgatct gtgagagcag gaacgttttc 20280 
gcgcgccaaa atctggttct cgggaaagcg 20340 
gcggatgagc tgggagtaga cggcctggtc 20400 
cagctcggcg cccaccaccg gaaagttgct 20460 
gtcttgcatc atgtctggca acgaccagta 20520 
agtgcaaagg agggtccatg agcggatccc 20580 
gtaccagctg cggtactggg tgaaggtgct 20640 
cttgctgtcc tctgtcaggg gtttgatcac 20700 
ttgcgcagcg ggggcggcag cttctgccgc 20760 

CCy Ly tyy Ca aayy fcyfc Cy C~CgCga£ fcy y C ~2 0 82 G~ 

tgccgcggct gccgcgttgg agttctcttc 20880 
tgcgccatcc ccgccctgtt cggtgtcatc 20940 
caacagtgcg ggaatgttac caccctccag 21000 
gaagggttgc cgcttgcgga tgcccaacaa 21060 
cccgcatcct ggcagcaaaa tgatgtctgg 21120 
catgacaaga ccagtgactg ggtcaaaccc 21180 
cccgatgtcg ctttccagaa ccccgttctg 21240 
cgcgttgttc ataaggtcta tggtcatggt 21300 
ctccacccac tcatatttca gctccacctg 21360 
cacccgcgcc ttaaacttat tggtaaacat 21420 
caggatggtt ttcaggtcgc cgccccagtg 21480 
gcttgcctcc cccgggctgt agtcattgtt 21540 
gtcgttctgg tagttcaggg atgccacatc 21600 
ggtggtgtcg aataggggtg ccaactcaga 21660 
ccgcaggtac cgcggaggca caaacggcgg 21720 
cgccgccgcc actggcgccg cgctcaccac 21780 
atacatcgcc gcgcgctgca tactaagggg 21840 
gccttggtga gttttttatt ttgcatcatg 21900 
ccagcctggg gcgaaggtgc gcaaacgggt 21960 
tgtcgtctgc cgagtcatcg tcctcccaca 22020 
gacgggtggg cgggcgcggg ccgggcacat 22080 
tactcatctt gtccactagg ctctctatcc 22140 
tttcgcgctg cggctgcagc agcgagttgt 22200 
ggcgggtggt gcgagtgctg gtaaaagacc 22260 
ctgaccgcgg ggccgcagcg cctagatcgg 22320 
ttcgcagcgc cgcctctgcc tgctcgcgct 22380 
gggaaaacac gctgtcgtct atgtcgtccc 22440 
caaatccccc ggtgtagaaa ccagggggcg 22500 
tgaaatactc ggggttcacg gcggccgcgc 22560 
acggccggtt tgaggcatac atgcccggtt 22620 
cgctgggcgt taccccgtcg cgcatcaggt 22680 
agatgcgctc ctcctcgctg tttaaactgt 22740 
tcagcaggta gttcagggtt gcctccaggc 22800 
tgacacttgt aatctcctgg aaagtatgct 22860 
ggtacagtgt cagcaagtga cctaggtatg 22920 
agggcgctat tagcagcagc aacaggcgcg 22980 
cgcggtcgcc tgtgggagcc cgcacccccc 23040 
tcaggtttac ggtctgcagg ccttgtctac 23100 
tggtctggaa aaaatagtct ggcccggact ggtacacctc actttgcggt gtctcagtca 23160
ccattagccg cagtgcgctc acaaagttgg tgtagtcctc ctgtccccgc ggcacgttgg 23220
cgggctgtgt actcaggaag gcgtttagtg caaccatgga gcccaggttg ccctgctgct 23280
gcgcgcgctc acgctgcgcc acggcctcgc gcacatcccc caccagccgg tccaggttgg 23340
tctgcacgtt gccgctgttg taacgagcca cgcgctgaag cagcgcgtcg tagaccaggc 23400
cggcctcatc gggccggatg gccctgtttt cggccagcgc gtttacgatc gccagcacct 23460
tctcgtgcgt ggggtttgcg cgcgccggga ccaccgcttc cagaattgcg gagagccggt 23520
tggcctgcgg ctgctgccgg aacgcgtcag ggttacgcgc agtcagcgac atgatgcggt 23580
ccatgacctg gcgccagtcg tccgtggagt taaggccgga cggctggctc tgcagcgccg 23640
cccgcaccgc cgggtccgtt gcgtcttgca tcatctgatc agaaacatca ccgcttagta 23700
ctcgccgtcc tctggctcgt actcatcgtc ctcgtcatat tcctccacgc cgccgacgtt 23760
gccagcgcgc gcgggtgcca ccgccagccc aggtccggcc ccagctgcct ccagggcgcg 23820
tcggcttggg gcccagcgca ggtcagcgcc cgcgtcaaag taggactcgg cctctctatc 23880
gccgctgccc gtgccagcca gggccctttg caggctgtgc atcagctcgc ggtcgctgag 23940
ctcgcgccgc cggctcacgc tcacggcctt gtggatgcgc tcgttgcgat aaacgcccag 24000
gtcgtcgctc aaggtaagca ccttcaacgc catgcgcatg tagaacccct cgatctttac 24060
ctccttgtct atgggaacgt aaggggtatg gtatatcttg cgggcgtaaa acttgcccag 24120
actgagcatg gaatagttaa tggcggccac cttgtcagcc aggctcaagc tgcgctcctg 24180
caccactatg ctctgcagaa tgtttatcaa atcgagcagc cagcggccct cgggctctac 24240
tatgttcagc agcgcatccc tgaatgcctc gttgtccctg ctgtgcirgca ctataaggaa 24300
cagctgcgcc atgagcggct tgctatttgg gttttgctcc agcgcgctta caaagtccca 24360
cagatgcatc agtcctatag ccacctcctc gcgcgccaca agcgtgcgca cgtggttgtt 24420
aaagcttttt tgaaagttaa tctcctggtt caccgtctgc tcgtacgcgg ttaccaggtc 24480
ggcggccgcc acgtgtgcgc gcgcgggact aatcccggtc cgcgcgtcgg gctcaaagtc 24540
ctcctcgcgc agcaaccgct cgcggttcag gccatgccgc aactcgcgcc ctgcgtggaa 24600
ctttcgatcc cgcatctcct cgggctcctc tccctcgcgg tcgcgaaaca ggttctgccg 24660
cggcacgtac gcctcgcgcg tgtcacgctt cagctgcacc cttgggtgtc gctcaggaga 24720
gggcgctcct agccgcgcca ggccctcgcc ctcctccaag tccaggtagt gccgggcccg 24780
gcgccgcggg ggttcgtaat caccatctgc cgccgcgtca gccgcggatg ttgcccctcc 24840
tgacgcggta ggagaagggg agggtgccct gcatgtctgc cgctgctctt gctcttgccg 24900
ctgctgagga ggggggcgca tctgccgcag caccggatgc atctgggaaa agcaaaaaag 24960
gggctcgtcc ctgtttccgg aggaatttgc aagcggggtc ttgcatgacg gggaggcaaa 25020
cccccgttcg ccgcagtccg gccggcccga gactcgaacc gggggtcctg cgactcaacc 25080
cttggaaaat aaccctccgg ctacagggag cgagccactt aatgctttcg ctttccagcc 25140
taaccgctta cgccgcgcgc ggccagtggc caaaaaagct agcgcagcag ccgccgcgcc 25200
tggaaggaag ccaaaaggag cgctcccccg ttgtctgacg tcgcacacct gggttcgaca 25260
cgcgggcggt aaccgcatgg atcacggcgg acggccggat ccggggttcg aaccccggtc 25320
gtccgccatg atacccttgc gaatttatcc accagaccac ggaagagtgc ccgcttacag 25380
gctctccttt tgcacggtct agagcgtcaa cgactgcgca cgcctcaccg gccagagcgt 25440
cccgaccatg gagcactttt tgccgctgcg caacatctgg aaccgcgtcc gcgactttcc 25500
gcgcgcctcc accaccgccg ccggcatcac ctggatgtcc aggtacatct acggatatca 25560
tcgccttatg ttggaagacc tcgcccccgg agccccggcc accctacgct ggcccctcta 25620
ccgccagccg ccgccgcact ttttggtggg atatcagtac ctggtigcgga cttgcaacga 25680
ctacgtcttt gactcaaggg cttactcgcg tctcaggtac accgagctct cgcagccggg 25740
tcaccagacc gttaactggt ccgttatggc caactgcact tacaccatca acacgggcgc 25800
ataccaccgc tttgtggaca tggatgactt ccagtctacc ctcacgcagg tgcagcaggc 25860
catattagcc gagcgcgttg tcgccgacct ggccctgctt cagccgatga ggggcttcgg 25920
ggtcacacgc atgggaggaa gagggcgcca cctacggcca aactccgccg ccgccgtagc 25980
gatagatgca agagatgcag gacaagagga aggagaagaa gaagtgccgg tagaaaggct 26040
catgcaagac tactacaaag acctgcgccg atgtcaaaac gaagcctggg gcatggccga 26100
ccgcctgcgc attcagcagg ccggacccaa ggacatggtg cttctgtcga ccatccgccg 26160
tctcaagacc gcctacttta attacatcat cagcagcacc tccgccagaa acaaccccga 26220
ccgccacccg ctgccgcccg ccacggtgct cagcctacct tgcgactgtg actggttaga 26280
cgcctttctc gagaggtttt ccgatccggt cgatgcggac tcgctcaggt ccctcggtgg 26340
cggagtacct acacaacaat tgttgagatg catcgttagc gccgtatccc tgccgcacgg 26400
cagccccccg ccaacccata accgggacat gacgggcggc gtcttccaac tgcgcccccg 26460
cgagaacggc cgcgccgtca ccgagaccat gcgccgtcgc cgcggggaga tgatcgagcg 26520
ctttgtcgac cgcctcccgg tgcgccgtcg tcgccgccgt gtcccccctc ccccaccgcc 26580
gccagaagaa gaagaagaag gggaggccct tatggaagag gagattgaag aagaagaggc 26640
ccctgtagcc tttgagcgcg aggtgcgcga cactgtcgcc gagctcatcc gtcttctgga 26700
ggaggagtta accgtgtcgg cgcgcaactc ccagtttttc aacttcgccg tggacttcta 26760 
cgaggccatg gagcgccttg aggccttggg ggatatcaac gaatccacgt tgcgacgctg 26820
ggttatgtac ttcttcgtgg cagaacacac cgccaccacc ctcaactacc tctttcagcg 26880
cctgcgaaac tacgccgtct tcgcccggca cgtggagctc aatctcgcgc aggtggtcat 26940 
gcgcgcccgc gatgccgaag ggggcgtggt ctacagccgc gtctggaacg agggaggcct 27000
caacgccttc tcgcagctca tggcccgcat ctccaacgac ctcgccgcca ccgtggagcg 27060 
agccggacgc ggagatctcc aggaggaaga gatcgagcag ttcatggccg aaatcgccta 27120
tcaagacaac tcaggagacg tgcaggagat tttgcgccag gccgccgtca acgacaccga 27180 
aattgattct gtcgaactct ctttcaggtt caagctcacc gggcccgtcg tcttcacgca 27240 
gaggcgccag attcaggaga tcaaccgccg cgtcgtcgcg ttcgccagca acctccgcgc 27300
gcagcaccag ctcctgcccg cgcgcggcgc cgacgtgccc ctgccccctc tcccggcggg 27360 
tcccgagccc cccctacctc cgggggcccg cccgcgtcac cgcttttaga tgcatcatcc 27420 
aaggacaccc ccgcggccca ccgcccgccg cgcggtaccg tagtcgcgcc gcggggatgc 27480 
ggcctcttgc aagtcatcga cgccgccacc aaccagcccc tggaaatcag gtatcacctg 27540 
gacctagccc gcgccctgac ccggctatgc gaggtaaacc tgcaggagct cccgcctgac 27600
ctgtcgccgc gggagctcca gaccatggac agctcccatc tgcgcgatgt tgtcatcaag 27660 
ctccgaccgc cgcgcgcgga catctggact ttgggctcgc gcggcgtggt ggtccgatcc 27720 
accataactc ccctcgagca gccagacggt caaggacaag cagccgaagt agaagaccac 27780
cagccaaacc cgccaggcga ggggctcaaa ttcccactct gcttccttgt gcgcggtcgt 27840 
caggtcaacc tcgtgcagga tgtacagccc gtgcaccgct gccagtactg cgcacgtttt 27900
tacaaaagcc agcacgagtg ttcggcccgt cgcagggact tctactttca ccacatcaac 27960
agccactcct ccaactggtg gcgggagatc cagttcttcc cgatcggctc gcatcctcgc 28020
accgagcgtc tctttgtcac ctacgatgta gagacctata cttggatggg ggcctttggg 28080 
aagcagctcg tgcccttcat gctggttatg aagttcggcg gagatgagcc tctggtgacc 28140
gccgcgcgag acctagccgt ggaccttgga tgggaccgct gggaacaaga cccgcttacc 28200
ttctactgca tcaccccaga aaaaatggcc ataggtcgcc agtttaggac ctttcgcgac 28260
cacctgcaaa tgctaatggc ccgtgacctg tggagctcat tcgtcgcttc caaccctcat 28320 
cttgcagact gggccctgtc agaacacggg ctcagctccc ctgaggagct cacctacgag 28380
gaacttaaaa aattgccctc catcaagggc accccgcgct tcttggaact ttacatcgtg 28440 
ggccacaaca tcaacggctt cgacgagatc gtgctcgccg cccaggtaat taacaaccgt 28500
tccgaggtgc cgggaccctt ccgcatcaca cgcaacttta tgcctcgcgc gggaaagata 28560 
cttttcaacg atgtcacctt cgccctgcca aacccgcgtt ccaaaaagcg cacggacttt 28620
ttgctctggg agcagggcgg atgcgacgac actgacttca aataccagta cctcaaagtc 28680
atggttaggg acacctttgc gctcacccac acctcgctcc ggaaggccgc gcaggcatac 28740
gcgctacccg tagaaaaggg atgctgcgcc taccaggccg tcaaccagtt ctacatgcta 28800
ggctcttacc gttcggaggc cgacgggttt ccgatccaag agtactggaa agaccgcgaa 28860
gagtttgtcc tcaaccgcga gctgtggaaa aaaaagggac aggataagta tgacatcatc 28920 
aaggaaaccc tggactactg cgccctagac gtgcaggtca ccgccgagct ggtcaacaag 28980
ctgcgcgact cctacgcctc cttcgtgcgt gacgcggtag gtctcacaga cgccagcttc 29040
aacgtcttcc agcgtccaac catatcatcc aactcacatg ccatcttcag gcagatagtc 29100 
ttccgagcag agcagcccgc ccgtagcaac ctcggtcccg acctcctcgc tccctcgcac 29160 
gaactatacg attacgtgcg cgccagcatc cgcggtggaa gatgctaccc tacatatctt 29220 
ggaatactca gagagcccct ctacgtttac gacatttgcg gcatgtacgc ctccgcgctc 29280
acccacccca tgccatgggg tcccccactc aacccatacg agcgcgcgct tgccgcccgc 29340 
gcatggcagc aggcgctaga cttgcaagga tgcaagatag actacttcga cgcgcgcctg 29400
ctgcccgggg tctttaccgt ggacgcagac cccccggacg agacgcagct agacccacta 29460
ccgccattct gttcgcgcaa gggcggccgc ctctgctgga ccaacgagcg cctacgcgga 29520 
gaggtagcca ccagcgttga ccttgtcacc ctgcacaacc gcggttggcg cgtgcacctg 29580
gtgcccgacg agcgcaccac cgtctttccc gaatggcggt gcgttgcgcg cgaatacgtg 29640
cagctaaaca tcgcggccaa ggagcgcgcc gatcgcgaca aaaaccaaac cctgcgctcc 29700
atcgccaagt tgctgtccaa cgccctctac gggtcgtttg ccaccaagct tgacaacaaa 29760
aagattgtct tttctgacca gatggacgcg gccaccctca aaggcatcac cgcgggccag 29820 
gtgaatatca aatcctcctc gtttttggaa actgacaatc ttagcgcaga agtcatgccc 29880
gcttttgaga gggagtactc accccaacag ctggccctcg cagacagcga tgcggaagag 29940
agtgaggacg aacgcgcccc cacccccttt tatagccccc cttcaggaac acccggtcac 30000 
gtggcctaca cctataaacc aatcaccttc cttgatgccg aagagggcga catgtgtctt 30060






cacaccctgg agcgagtgga ccccctagtg gacaacgacc gctacccctc ccacttagcc 30120
tccttcgtgc tggcctggac gcgagccttc gtctcagagt ggtccgagtt tctatacgag 30180
gaggaccgcg gaacaccgct cgaggacagg cctctcaagt ctgtatacgg ggacacggac 30240
agccttttcg tcaccgagcg tggacaccgg ctcatggaaa ccagaggtaa gaaacgcatc 30300
aaaaagcatg ggggaaacct ggtttttgac cccgaacggc cagagctcac ctggctcgtg 30360
gaatgcgaga ccgtctgcgg ggcctgcggc gcggatgcct actccccgga atcggtattt 30420
ctcgcgccca agctctacgc ccttaaaagt ctgcactgcc cctcgtgcgg cgcctcctcc 30480
aagggcaagc tgcgcgccaa gggccacgcc gcggaggggc tggactatga caccatggtc 30540
aaatgctacc tggccgacgc gcagggcgaa gaccggcagc gcttcagcac cagcaggacc 30600
agcctcaagc gcaccctggc cagcgcgcag cccggagcgc accccttcac cgtgacccag 30660
actacgctga cgaggaccct gcgcccgtgg aaagacatga ccctggcccg tctggacgag 30720
caccgactac tgccgtacag cgaaagccgc cccaacccgc gaaacgagga gatatgctgg 30780
atcgagatgc cgtagagcac gtgaccgagc tgtgggaccg cctggaactg cttggtcaaa 30840
cgctcaaaag catgcctacg gcggacggcc tcaaaccgtt gaaaaacttt gcttccttgc 30900
aagaactgct atcgctgggc ggcgagcgcc ttctggcgca tttggtcagg gaaaacatgc 30960
aagtcaggga catgcttaac gaagtggccc ccctgctcag ggatgacggc agctgcagct 31020
ctcttaacta ccagttgcag ccggtaatag gtgtgattta cgggcccacc ggctgcggta 31080
agtcgcagct gctcaggaac ctgctttctt cccagctgat ctcccctacc ccggaaacgg 31140
ttttcttcat cgccccgcag gtagacatga tccccccatc tgaactcaaa gcgtgggaaa 31200
tgcaaatctg tgagggtaac tacgcccctg ggccggatgg aaccattata ccgcagtctg 31260
gcaccctccg cccgcgcttt gtaaaaatgg cctatgacga tctcatcctg gaacacaact 31320
atgacgttag tgatcccaga aatatcttcg cccaggccgc cgcccgtggg cccattgcca 31380
tcattatgga cgaatgcatg gaaaatctcg gaggtcacaa gggcgtctcc aagttcttcc 31440
acgcatttcc ttctaagcta catgacaaat ttcccaagtg caccggatac actgtgctgg 31500
tggttctgca caacatgaat ccccggaggg atatggctgg gaacatagcc aacctaaaaa 31560
tacagtccaa gatgcatctc atatccccac gtatgcaccc atcccagctt aaccgctttg 31620
taaacactta caccaagggc ctgcccctgg caatcagctt gctactgaaa gacattttta 31680
ggcaccacgc ccagcgctcc tgctacgact ggatcatcta caacaccacc ccgcagcatg 31740
aagctctgca gtggtgctac ctccacccca gagacgggct tatgcccatg tatctgaaca 31800
tccagagtca cctttaccac gtcctggaaa aaatacacag gaccctcaac gaccgagacc 31860
gctggtcccg ggcctaccgc gcgcgcaaaa cccctaaata aagacagcaa gacacttgct 31920
tgatcaaaat ccaaacagag tctggttttt atttatgttt taaaccgcat tgggagggga 31980
ggaagccttc agggcagaaa cctgctggcg cagatccaac agctgctgag aaacgacatt 32040
aagttcccgg gtcaaagaat ccaattgtgc caaaagagcc gtcaacttgt catcgcgggc 32100
ggatgaacgg gaagctgcac tgcttgcaag cgggctcagg aaagcaaagt cagtcacaat 32160
cccgcgggcg gtggctgcag cggctgaagc ggcggcggag gctgcagtct ccaacggcgt 32220
tccagacacg gtctcgtagg tcaaggtagt agagtttgcg ggcaggacgg ggcgaccatc 32280
aatgctggag cccatcacat tctgacgcac cccggcccat gggggcatgc gcgttgtcaa 32340
atatgagctc acaatgcttc catcaaacga gttggcgctc atggcggcgg ctgctgcaaa 32400
acagatacaa aactacatga gacccccacc ttatatattc tttcccaccc ttaagccccg 32460
cccatcgatg gcaaacagct attatgggta ttatgggtgc tagcgacatg aggttgcccc 32520
gtattcagtg tcgctgattt gtattgtctg aagttgtttt tacgttaagt tgatgcagat 32580
caattaatac gatacctgcg tcataattga ttatttgacg tggtttgatg gcctccacgc 32640
acgttgtgat atgtagatga taatcattat cactttacgg gtcctttccg gtgatccgac 32700
aggttacggg gcggcgacct cgcgggtttt cgctatttat gaaaattttc cggtttaagg 32760
cgtttccgtt cttcttcgtc ataacttaat gtttttattt aaaataccct ctgaaaagaa 32820
aggaaacgac aggtgctgaa agcgaggctt tttggcctct gtcgtttcct ttctctgttt 32880
ttgtccgtgg aatgaacaat ggaagttaac ggatccaggc cgcgagcaaa aggccagcaa 32940
aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct 33000
gacgagcatc acaaaaatca acgctcaagt cagaggtggc gaaacccgac aggactataa 33060
agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 33120
cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca 33180
cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 33240
ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 33300
gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 33360
tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga 33420
acagtatttg gtatctgcgc tctgccaaag ccagttacct tcggaaaaag agttggtagc 33480
tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag 33540
attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac 33600
gctcagtgga acgaaaactc acgttaaggg attttggtca tcagattatc aaaaaggatc 33660 
ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag 33720 
taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt 33780 
ctatttcgtt catccatagt tgcctgactc cccgtagtgt agataactac gatacgggag 33840 
ggcttaccat ccggccccag tgctgcaatg ataccgcgtg acccacgctc accggctcct 33900 
gatttatcag caataaacca gccagccgga agtgccgagc gcagaagtgg tcctgcaact 33960 
ttatccgcct ccatccagtc tattagttgt tgccgggaag ctagagtaag tagttcgcca 34020 
gttaatagtt ttcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg 34080 
tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc 34140 
atgttgtgca aaaaagcggt tagctccttc ggtcctccga tagttgtcag aagtaagttg 34200 
gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca 34260 
tccgtaagat gcttttctgt gactggtgag tattcaacca agaatacggg ataataccgc 34320 
gccacatagc agaactttaa aagtgctcat cattgggaaa cgttcttcgg ggcgaaaact 34380 
ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgcg cacccaagtg 34440
atcttctgca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa 34500 
tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac ttttcctttt 34560 
tcaatattat tgaagcattt atcagggtta ttgtctcatc agcggataca tatttg 34616 



<210> 3 
<211> 31672 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> derived from Adenovirus 
<400> 3 

gaatcggcca gcgcgaatta actataacgg tcctaaggta gcgtcatcat cataatatac 60 
cttattttgg attgaagcca atatgataat gagggggtgg agtttgtgac gtggcgcggg 120 
gcgtgggaac ggggcgggtg acgtaggttt tagggcggag taacttgcat gtattgggaa 180 
ttgtagtttt tttaaaatgg gaagttacgt atcgtgggaa aacggaagtg aagatttgag 240 
gaagttgtgg gttttttggc tttcgtttct gggcgtaggt tcgcgtgcgg ttttctgggt 300 
gttttttgtg gactttaacc gttacgtcat tttttagtcc tatatatact cgctctgtac 360 
ttggcccttt ttacactgtg actgattgag ctggtgccgt gtcgagtggt gttttttaat 420 
aggttttttt actggtaagg ctgactgtta tggctgccgc tgtggaagcg ctgtatgttg 480 
ttctggagcg ggagggtgct attttgccta ggcaggaggg tttttcaggt gtttatgtgt 540 
ttttctctcc tattaatttt gttatacctc ctatgggggc tgtaatgttg tctctacgcc 600 
tgcgggtatg tattcccccg ggctatttcg gtcgcttttt agcactgacc gatgttaacc 660 
aacctgatgt gtttaccgag tcttacatta tgactccgga catgaccgag gaactgtcgg 720 
tggtgctttt taatcacggt gaccagtttt tttacggtca cgccggcatg gccgtagtcc 780 
gtcttatgct tataagggtt gtttttcctg ttgtaagaca ggcttctaat gtttaaatgt 840 
ttttttttgt tattttattt tgtgtttaat gcaggaaccc gcagacatgt ttgagagaaa 900 
aatggtgtct ttttctgtgg tggttccgga acttacctgc ctttatctgc atgagcatga 960 
ctacgatgtg cttgcttttt tgcgcgaggc tttgcctgat tttttgagca gcaccttgca 1020 
ttttatatcg ccgcccatgc aacaagctta cataggggct acgctggtta gcatagctcc 1080 
gagtatgcgt gtcataatca gtgtgggttc ttttgtcatg gttcctggcg gggaagtggc 1140 
cgcgctggtc cgtgcagacc tgcacgatta tgttcagctg gccctgcgaa gggacctacg 1200 
ggatcgcggt atttttgtta atgttccgct tttgaatctt atacaggtct gtgaggaacc 1260 
tgaatttttg caatcatgat tcgctgcttg aggctgaagg tggagggcgc tctggagcag 1320
atttttacaa tggccggact taatattcgg gatttgctta gagacatatt gataaggtgg 1380 
cgagatgaaa attatttggg catggttgaa ggtgctggaa tgtttataga ggagattcac 1440 
cctgaagggt ttagccttta cgtccacttg gacgtgaggg cagtttgcct tttggaagcc 1500 
attgtgcaac atcttacaaa tgccattatc tgttctttgg ctgtagagtt tgaccacgcc 1560 
accggagggg agcgcgttca cttaatagat cttcattttg aggttttgga taatcttttg 1620 
gaataaaaaa aaaaaaaaca tggttcttcc agctcttccc gctcctcccg tgtgtgactc 1680 
gcagaacgaa tgtgtaggtt ggctgggtgt ggcttattct gcggtggtgg atgttatcag 1740
ggcagcggcg catgaaggag tttacataga acccgaagcc agggggcgcc tggatgcttt 1800 
gagagagtgg atatactaca actactacac agagcgagct aagcgacgag accggagacg 1860 
cagatctgtt tgtcacgccc gcacctggtt ttgcttcagg aaatatgact acgtccggcg 1920 
ttccatttgg catgacacta cgaccaacac gatctcggtt gtctcggcgc actccgtaca 1980 
gtagggatcg cctacctcct tttgagacag agacccgcgc taccatactg gaggatcatc 2040 
cgctgctgcc cgaatgtaac actttgacaa tgcacaacgt gagttacgtg cgaggtcttc 2100 
cctgcagtgt gggatttacg ctgattcagg aatgggttgt tccctgggat atggttctga 2160 
cgcgggagga gcttgtaatc ctgaggaagt gtatgcacgt gtgcctgtgt tgtgccaaca 2220 
ttgatatcat gacgagcatg atgatccatg gttacgagtc ctgggctctc cactgtcatt 2280 
gttccagtcc cggttccctg cagtgcatag ccggcgggca ggttttggcc agctggttta 2340 
ggatggtggt ggatggcgcc atgtttaatc agaggtttat atggtaccgg gaggtggtga 2400 
attacaacat gccaaaagag gtaatgttta tgtccagcgt gtttatgagg ggtcgccact 2460 
taatctacct gcgcttgtgg tatgatggcc acgtgggttc tgtggtcccc gccatgagct 2520 
ttggatacag cgccttgcac tgtgggattt tgaacaatat tgtggtgctg tgctgcagtt 2580 
actgtgctga tttaagtgag atcagggtgc gctgctgtgc ccggaggaca aggcgtctca 2640 
tgctgcgggc ggtgcgaatc atcgctgagg agaccactgc catgttgtat tcctgcagga 2700 
cggagcggcg gcggcagcag tttattcgcg cgctgctgca gcaccaccgc cctatcctga 2760 
tgcacgatta tgactctacc cccatgtagg cgtggacttc cccttcgccg cccgttgagc 2820 
aaccgcaagt tggacagcag cctgtggctc agcagctgga cagcgacatg aacttaagcg 2880
agctgcccgg ggagtttatt aatatcactg atgagcgttt ggctcgacag gaaaccgtgt 2940 
ggaatataac acctaagaat atgtctgtta cccatgatat gatgcttttt aaggccagcc 3000 
ggggagaaag gactgtgtac tctgtgtgtt gggagggagg tggcaggttg aatactaggg 3060 
ttctgtgagt ttgattaagg tacggtgatc aatataagct atgtggtggt ggggctatac 3120 
tactgaatga aaaatgactt gaaattttct gcaattgaaa aataaacacg ttgaaacata 3180 
acatgcaaca ggttcacgat tctttattcc tgggcaatgt aggagaaggt gtaagagttg 3240 
gtagcaaaag tttcagtggt gtattttcca ctttcccagg accatgtaaa agacatagag 3300 
taagtgctta cctcgctagt ttctgtggat tcactagtgc cattaagtgt aatggtaagt 3360 
atcataggtt tagttttatc accatgcaag taaacttgac tgacaatgtt atttttagca 3420 
gtttgacttt gggtttttgg ataggctaga aggttaggca taaatccaac tgcatttgtg 3480 
tatggatttg cattagttga gttcccattt ctaaagttcc agtaatgttt tttaagtgag 3540 
gagttctcca ttagaacacc gttttggtca aatctaagga atatactaac acttgcaacg 3600 
gtgcctgtca tggatgaaag atctccagat acagccaaag cagctacagt agctagtact 3660 
tgactcccac attttgtaag aaccaaagta aatttgcagt cattatctga atgaattctg 3720 
cagttaggag atgggtctgg ggttgtccac agggtaagtt tgtcatcatt tttgtttcct 3780 
attgtaatgg cccctgagtt gtcaaagctt aaacccgctc caagtttagt aatcatggca 3840 
ccgttttcat tgtaatcaat gccagagcca attttagttt ttattgggtt gatatctgga 3900 
gactcagatg tgtttgtatc aaactccaga ccctttcctg catttatagc tatggcagta 3960 
ttatcaaagt ttagtccact ggattttttt atgctaactt ccagtttttt agtattgttt 4020 
gatgcattaa aaaggtatag gcctctgtta tagtttatgt ccaagttatg agatgcatta 4080 
atatacaggg gtccctgccc cagtttaaga cgtagttttg tttgagcatc aaatgggtaa 4140 
tccacatcta gaattaacaa gttgttattt atacgcatgc caccgcccgt tttaatttcc 4200 
atgttgtttg atgaatcata accaatagct cctgcaactt tggttctaag ggagttttgt 4260 
tcaacggtga cacctggtcc agtaactact gttagtgtat cggagttttg tgctacttgc 4320 
aaaggaccgc ttattttaat tcctattttt ccattattta cataaatagg atcttccatg 4380 
ttaatgccca agctacccgt ggcagtagtt agcgggggtg atgcagttac agtaagggtg 4440 
tcgctgtcac tgccagagag gggggctgat gtttgcaggg ctagctttcc atctgacact 4500 
gtaatgggcc ctttagtagc aatgcttagt ttggagtctt gcacggtcag tggggcttgt 4560 
gactgtacgc taagagcgcc gctagtaact atcagaggag cggtggttgc cactgttagg 4620 
gcgcctgagg taattgtaag tggtgcggag gtgtccaaac ttatgtttga ctttgttttt 4680 
ttaagtggct gagtaacagt ggttacattt tgggaggtga ggtttccggc cttgtctagg 4740 
gtaagaccgc tgcccatttt aagcgcaagc atgccgtggg aggtgtccaa aggttcggag 4800 
acgcgtagag agagaactcc agggggactt tcttggaaac cattgggtga aacaaatgga 4860 
ggggtaagaa agggcacagt tggaggcccg gtttctgtgt catatggata cacggggttg 4920 
aaggtgtctt cagacggtct ggcgcgtttc atctgcaaca atatgaagat agtgggtgcg 4980 
gagggacaag aacatgagga atttgacatc ccatttaaac tttggagaaa gtttgcagct 5040 
aaaaggcggc tgagatacca gagttgggag gaaggaaagg aggtgatgct gaataagctg 5100 
gacaaagatt tgctgactga ttttaagtaa gtaatttatt cagtcgtagc cgtccgccga 5160 
gtctttcacc gcgtcaaagt tgggaataaa ctggtccggg tagtggccgg gaggtccaga 5220
aaaggggttg aagtaaaccg aaggcacgaa ctcctcaata aattgtagag ttccaatgcc 5280
tccggagcgc ggctccgagg acgaggtctg cagagttagg atcgcctgac ggggcgtaaa 5340
tgaagagcgg ccagcgccgc cgatctgaaa tgtcccgtcc ggacggagac caagagagga 5400
gctcaccgac tcgtcgttga gctgaatacc tcgccctctg attttcaggt gagttatacc 5460
ctgcccgggc gaccgcaccc tgtgacgaaa gccgcccgca agctgcgccc ctgagttagt 5520
catctgaact tcggcctggg cgtctctggg aagtaccaca gtggtgggag cgggactttc 5580
ctggtacacc agggcagcgg gccaactacg gggattaagg ttattacgag gtgtggtggt 5640
aatagccgcc tgttcgagga gaattcggtt tcggtgggcg cggattccgt tgacccggga 5700
tatcatgtgg ggtcccgcgc tcatgtagtt tattcgggtt gagtagtctt gggcagctcc 5760
agccgcaagt cccatttgtg gctggtaact ccacatgtag ggcgtgggaa tttccttgct 5820
cataatggcg ctgacgacag gtgctggcgc cgggtgtggc cgctggagat gacgtagttt 5880
tcgcgcttaa atttgagaaa gggcgcgaaa ctagtcctta agagtcagcg cgcagtattt 5940
gctgaagaga gcctccgcgt cttccagcgt gcgccgaagc tgatcttcgc ttttgtgata 6000
caggcagctg cgggtgaggg agcgcagaga cctgtttttt attttcagct cttgttcttg 6060
gcccctgctt tgttgaaata tagcatacag agtgggaaaa atcctatttc taagctcgcg 6120
ggtcgatacg ggttcgttgg gcgccagacg cagcgctcct cctcctgctg ctgccgccgc 6180
tgtggatttc ttgggctttg tcagagtctt gctatccggt cgcctttgct tctgtgtgac 6240
cgctgctgtt gctgccgctg ccgctgccgc cggtgcagta ggggctgtag agatgacggt 6300
agtaatgcag gatgttacgg gggaaggcca cgccgtgatg gtagagaaga aagcggcggg 6360
cgaaggagat gttgccccca cagtcttgca agcaagcaac tatggcgttc ttgtgcccgc 6420
gccacgagcg gtagccttgg cgctgttgtt gctcttgggc taacggcggc ggctgcttag 6480
acttaccggc cctggttcca gtggtgtccc atctacggtt gggtcggcga acaggcagtg 6540
ccggcggcgc ctgaggagcg gaggttgtag cgatgctggg aacggttgcc aatttctggg 6600
gcgccggcga ggggaatgcg accgagggtg acggtgtttc gtctgacacc tcttcggcct 6660
cggaagcttc gtctaggctg tcccagtctt ccatcatctc ctcctcctcg tccaaaacct 6720
cctctgcctg actgtcccag tattcctcct cgtccgtggg tggcggcggc ggcagctgca 6780
gcttcttttt gggtgccatc ctgggaagca agggcccgcg gctgctgata gggctgcggc 6840
ggcgggggga ttgggttgag ctcctcgccg gactgggggt ccaggtaaac cccccgtccc 6900
tttcgtagca gaaactcttg gcgggctttg ttgatggctt gcaattggcc aaggatgtgg 6960
ccctgggtaa tgacgcaggc ggtaagctcc gcatttggcg ggcgggattg gtcttcgtag 7020
aacctaatct cgtgggcgtg gtagtcctca ggtacaaatt tgcgaaggta agccgacgtc 7080
cacagccccg gagtgagttt caaccccgga gccgcggact tttcgtcagg cgagggaccc 7140
tgcagctcaa aggtaccgat aatttgactt tcgctaagca gttgcgaatt gcagaccagg 7200
gagcggtgcg gggtgcatag gttgcagcga cagtgacact ccagtaggcc gtcaccgctc 7260
acgtcttcca tgatgtcgga gtggtaggca aggtagttgg ctagctgcag aaggtagcag 7320
tgaccccaaa gcggcggagg gcattcacgg tacttaatgg gcacaaagtc gctaggaagc 7380
gcacagcagg tggcgggcag aattcctgaa cgctctagga taaagttcct aaagttttgc 7440
aacatgcttt gactggtgaa gtctggcaga ccctgttgca gggttttaag caggcgttcg 7500
gggaagataa tgtccgccag gtgcgcggcc acggagcgct cgttgaaggc cgtccatagg 7560
tccttcaagt tttgctttag cagcttctgc agctccttta ggttgcgctc ctccaggcat 7620
tgctgccaca cgcccatggc cgtttgccag gtgtagcaca gaaataagta aacgcagtcg 7680
cggacgtagt cgcggcgcgc ctcgcccttg agcgtggaat gaagcacgtt ttgcccgagg 7740
cggttttcgt gcaaaattcc aaggtaggag accaggttgc agagctccac gttggaaatt 7800
ttgcaggcct ggcgcacgta gccctggcga aaggtgtagt gcaacgtttc ctctagcttg 7860
cgctgcatct ccgggtcagc aaagaaccgc tgcatgcact caagctccac ggtaacaagc 7920
actgcggcca tcattagctt gcgtcgctcc tccaagtcgg caggctcgcg cgtctcaagc 7980
cagcgcgcca gctgctcatc gccaactgcg ggtaggccct cctcggtttg ttcttgcaag 8040
tttgcatccc tctccagggg tcgtgcacgg cgcacgatca gctcgctcat gactgtgctc 8100
ataaccttgg ggggtaggtt aagtgccggg taggcaaagt gggtgacctc gatgctgcgt 8160
ttcagcacgg ctaggcgcgc gttgtcaccc tcaagttcca ccagcactcc acagtgactt 8220
tcattttcgc tgttttcttg ttgcagagcg tttgccgcgc gtttctcgtc gcgtccaaga 8280
ccctcaaaga tttttggcac ttcgtcgagc gaggcgatat caggtatgac agcgccctgc 8340
cgcaaggcca gctgcttgtc cgctcggctg cggttggcac ggcaggatag gggtatcttg 8400
cagttttgga aaaagatgtg ataggtggca agcacctctg gcacggcaaa tacggggtag 8460
aagttgaggc gcgggttggg ctcgcatgtg ccgttttctt ggcgtttggg gggtacgcgc 8520
ggtgagaaca ggtggcgttc gtaggcaagg ctgacatccg ctatggcgag gggcacatcg 8580
ctgcgctctt gcaacgcgtc gcagataatg gcgcactggc gctgcagatg cttcaacagc 8640
acgtcgtctc ccacatctag gtagtcgcca tgcctttggt ccccccgccc gacttgttcc 8700
tcgtttgcct ctgcgtcgtc ctggtcttgc tttttatcct ctgttggtac tgagcgatcc 8760
tcgtcgtctt cgcttacaaa acctgggtcc tgctcgataa tcacttcctc ctcctcaagc 8820
gggggtgcct cgacggggaa ggtggtaggc gcgttggcgg catcggtgga ggcggtggtg 8880
gcgaactcaa agggggcggt taggctgtcc tccttctcga ctgactccat gatctttttc 8940
tgcctatagg agaaggaaat ggccagtcgg gaagaggagc agcgcgaaac cacccccgag 9000
cgcggacgcg gtgcggcgcg acgtccacca accatggagg acgtgtcgtc cccgtcgccg 9060
tcgccgccgc ctccccgcgc gcccccaaaa aagcggctga ggcggcgtct cgagtccgag 9120
gacgaagaag actcgtcaca agatgcgctg gtgccgcgca cacccagccc gcggccatcg 9180
acctcgacgg cggatttggc cattgcgtcc aaaaagaaaa agaagcgccc ctctcccaag 9240
cccgagcgcc cgccatcccc agaggtgatc gtggacagcg aggaagaaag agaagatgtg 9300
gcgctacaaa tggtgggttt cagcaaccca ccggtgctaa tcaagcacgg caagggaggt 9360
aagcgcacgg tgcggcggct gaatgaagac gacccagtgg cgcggggtat gcggacgcaa 9420
gaggaaaagg aagagtccag tgaagcggaa agtgaaagca cggtgataaa cccgctgagc 9480
ctgccgatcg tgtctgcgtg ggagaagggc atggaggctg cgcgcgcgtt gatggacaag 9540
taccacgtgg ataacgatct aaaggcaaac ttcaagctac tgcctgacca agtggaagct 9600
ctggcggccg tatgcaagac ctggctaaac gaggagcacc gcgggttgca gctgaccttc 9660
accagcaaca agacctttgt gacgatgatg gggcgattcc tgcaggcgta cctgcagtcg 9720
tttgcagagg taacctacaa gcaccacgag cccacgggct gcgcgttgtg gctgcaccgc 9780
fcgcgc fcg ags fecgasgg eg s get fcaagtg l. cfcacacggga gcafcfcafcgafc aaafcaaggag 9840
cacgtgattg aaatggatgt gacgagcgaa aacgggcagc gcgcgctgaa ggagcagtct 9900
agcaaggcca agatcgtgaa gaaccggtgg ggccgaaatg tggtgcagat ctccaacacc 9960
gacgcaaggt gctgcgtgca tgacgcggcc tgtccggcca atcagttttc cggcaagtct 10020
tgcggcatgt tcttctctga aggcgcaaag gctcaggtgg cttttaagca gatcaaggct 10080
ttcatgcagg cgctgtatcc taacgcccag accgggcacg gtcaccttct gatgccacta 10140
cggtgcgagt gcaactcaaa gcctgggcat gcaccctttt tgggaaggca gctaccaaag 10200
ttgactccgt tcgccctgag caacgcggag gacctggacg cggatctgat ctccgacaag 10260
agcgtgctgg ccagcgtgca ccacccggcg ctgatagtgt tccagtgctg caaccctgtg 10320
tatcgcaact cgcgcgcgca gggcggaggc cccaactgcg acttcaagat atcggcgccc 10380
gacctgctaa acgcgttggt gatggtgcgc agcctgtgga gtgaaaactt caccgagctg 10440
ccgcggatgg ttgtgcctga gtttaagtgg agcactaaac accagtatcg caacgtgtcc 10500
ctgccagtgg cgcatagcga tgcgcggcag aacccctttg atttttaaac ggcgcagacg 10560
gcaagggtgg ggggtaaata atcacccgag agtgtacaaa taaaaacatt tgcctttatt 10620
gaaagtgtct cctagtacat tatttttaca tgtttttcaa gtgacaaaaa gaagtggcgc 10680
tcctaatctg cgcactgtgg ctgcggaagt agggcgagtg gcgctccagg aagctgtaga 10740
gctgttcctg gttgcgacgc agggtgggct gtacctgggg actgttaagc atggagttgg 10800
gtaccccggt aataaggttc atggtggggt tgtgatccat gggagtttgg ggccagttgg 10860
caaaggcgtg gagaaacatg cagcagaata gtccacaggc ggccgagttg ggcccctgca 10920
cgctttgggt ggacttttcc agcgttatac agcggtcggg ggaagaagca atggcgctac 10980
ggcgcaggag tgactcgtac tcaaactggt aaacctgctt gagtcgttgg tcagaaaagc 11040
caaagggctc aaagaggtag catgtttttg agcgcgggtt ccaggcaaag gccatccagt 11100
gtacgccccc agtctcgcga ccggccgtat tgactatggc gcaggcgagc ttgtgtggag 11160
aaacaaagcc tggaaagcgc ttgtcatagg tgcccaaaaa atatggccca caaccaagat 11220
ctttgacaat ggctttcagt tcctgctcac tggagcccat ggcggcagct gttgttgatg 11280
ttgcttgctt cttttatgtt gtggcgttgc cggccgagaa gggcgtgcgc aggtacacgg 11340
tctcgatgac gccgcggtgc ggctggtgca cacggaccac gtcaaagact tcaaacaaaa 11400
cataaagaag ggtgggctcg tccatgggat ccacctcaaa agtcatgtct agcgcgtggg 11460
cggagttggc gtagagaagg ttttggccca ggtctgtgag tgcgcccatg gacataaagt 11520
tactggagaa tgggatgcgc caaagggtgc gatcgcaaag aaactttttc tgggtaatac 11580
tgtcaaccgc ggttttgcct attagtgggt agggcacgtt ggcggggtaa gcctgtccct 11640
cgcgcatggt gggagcgagg tagcctacga atcctgagtt gttatgctgg tgaagaattc 11700
caacctgctg atactccttg tatttagtat cgtcaaccac ttgccggctc atgggctgga 11760
agtttctgaa gaacgagtac atgcggtcct tgtagctttc tggaatgtag aagccctggt 11820
agccaatatt gtagttggcc aacatctgca ccaggaacca gtccttggtc atgttgcact 11880
gagctacgtt gtagccctcc ccgtcaactg agcgtttaat ctcaaactca ttgggagtaa 11940
gcaggcggtc gttgcccggc cagctaacag aagagtcaaa ggtaatggcc accttcttaa 12000
aggtgtgatt aagatagaag gttccgtcaa ggtatggtat ggagccagag taggtgtagt 12060
aagggtcgta gcctgatccc agggaagggg tttcctttgt cttcaagcgt gtgaaggccc 12120
aaccgcgaaa tgctgcccag ttgcgcgatg
cgggtatggg gtatagcatg ttggcggcgg 
tgtcatttct gagcatggct tccagcgtgg 
tggcgtaaag acaaatgctg tcaaacttaa 
ccagagagct ctgcagaacc atgttaacat 
agcctggcag gaggaggagg tttttaatgg 
agggcacgta gcggccgttt cccaacaaca 
ggtggttaaa gggattaacg ttgtccatgt 
tgtagcagtc tacaagcccg ggagccacca 
ggttgtcaga tatttccaca ttggtggggt 
caatattgga gtaaaggaaa tttctccata 
agttgttacc cactcctatt tcattacgtg 
tatctccatt atcgcctgag ccattgccat 
ccccaatacc cccaagagga aaacaataat 
caatgattct aacatctgga tcatagctgt 
tggttctatc acctatggaa tcaagcaaga 
gcaaatctac cacggcattt agctgcgatg 
tgctgttata atacattagg ccaataaaat
ttggcataga ttgttgaccc aacatagctt 
aagaeagafeg tgtgtefeggg gt-ttcea£afe 
gtttagtagc attgccttgc cggtcgttca
caacctttgg aagaggcacc cctttttcat 
gatttgtagg cctggcataa gatccatagc 
ctctccctcc tgccgcatta gcatcagctt 
ctggttgata ggaaggatct t gcgtatacag 
ctatttgtag cccgcttttt gtaattgttt 
gtgttttctt agtagcctga tctcgagcgt 
catcttcctc ttcttcatcc tcggcaactg 
cacaggagtt aggagcgccc ttgggagcta 
taaaagtagg ccccctgtcc agcacgccgc 
gcacacggtt gtcacccaca gccagggtga 
cgcggtccac agggatgaac cgcagcgtca 
gcgtaggtgc caccgtgggg tttctaaact 
cgcgggcaaa ctgcaccagc ccggggctca 
gcatgtaaga ccactgcggc atcatcgaag 
cggctcagca gctcctctgg cggcgacatg 
ctatttagaa gcatcgtcgg cgcttcaggg 
gtgtgctttg ccagttgcca ctggctacgg 
ggcgcaggga cgcgcggcta gggcgggtta 
gtttctgctg ggtgtcagcg gggggaggca 
gcactccggt agccatgggc gcgatgggac 
cctcgtacga gggaggctca tctatttgcg 
gacgcttttc gccacgcccc tctggagaca 
cgggagggcg gggatcaagc ttactgttaa 
ccaccacccc gctaatgcca gaggccaggc 
ctttcaactt gtccctcagc atctggcctg 
tcttaatggt ggaaccgaaa tttttaatgc 
caccgctcat attgctggtg ccgatatctt 
gtcgcggggc cagagacgca aagttgatgt 
agcgagcgtg agactccaga ctttttattt 
cagtgtctct gcgcctgcaa ggccacggat 
gcgatcagtg gaataaggag gggcaggata 
cgccgccggt ggtgcgcacg acgcatgccg 
ctacggtgca ttcttcctcg gaatcccggc 
atatctgcaa gaaccacaaa gaccggcttt 
ggcagcacca gggtcctgcc tccttcgcga 
acgggctggc gacggcgacg gcggcggcgg 
gctcgtcttc tggggcggta ggtgtagcca 



ggatggagat gggcacgttg gtggcgttgg 12180 
aaaggtagtc attaaaggac tggtcgttgg 12240 
aggccgtgtt gtgggccatg gggaagaagg 12300 
tgctagcccc gtcaactcta agatcgtttc 12360 
ccttcctgaa gttccattca tatgtatatg 12420 
caaaaaactt ttggggcacc tgaatgtgaa 12480 
tggagcgata acggaggccc gcattgcggt 12540
agtccagaga ccagcgcgcc ccaaggttaa 12600 
ctcgcttgtt catgtagtcg taggtgttgg 12660 
tgtattttag cttgtctggc aggtacagcg 12720 
ggttggcatt taggttaatt tccatggcaa 12780 
ttgcaaaagt ttcatctttt gtccatgtag 12840 
tagccttaat agcttgatag gtgtcagtta 12900 
ttggcaattc atcctcagtt ccatggtttt 12960 
ctacagcctg attccacata gaaaaatatc 13020 
gttgatagga cagctctgtg tttctgtctt 13080 
cctgaccagc aagaacaccc atgttgccag 13140 
tgtccctgaa agcaatgtaa ttgggtctgt 13200 
tagaattttc atcacctttt ccaggtttgt 13260 
ttacatcttc actgtacaaa accacttttg 13320
aagaggtagt atttgagaag aattgcaagt 13380 
ccggaaccag aacggattga ccaccaaaag 13440 
atggtttcat gggagttgtt tttttaagca 13500 
cgttccactg agattcgcca atttgaggtt 13560 
gtttagcttg tgtttctgca ttgtctgatc 13620 
ctccagacaa aggagcctgg gcatagacat 13680 
tttgctcttc ttcttcctct tcttcatctt 13740 
cccggccgct atcttcggtt tgttcccact 13800 
gagcgttgta ggcagtgccg gagtagggct 13860 
ggatgtcaaa gtacgtggaa gccatatcaa 13920 
accgcgcttt gtacgagtac gcggtatcct 13980 
aacgctggga ccggtctgtg gttacgtcgt 14040 
tgttattcag gctgaagtac gtctcggtgg 14100 
ggtactccga ggcgtcctgg cccgagatgt 14160 
gggtagccat cttggaaagc gggcgcacgg 14220 
gacgcataca tgacacatac gacacgttag 14280 
attgcacccc cagacccacg atgctgttca 14340 
gccgcatcga tcgcggaccg ctggcggcac 14400 
caacaacggc ggacggccct ggcagcacag 14460 
ggtccagcgt tacaggtgtg tgctggccca 14520 
gggtggtggg caggccttgc tttagtgcct 14580 
tcaccagagt ttcttccctg tcgggccgcg 14640 
ctgtctccac ggccggtgga ggctcctcta 14700 
tcttattttg cactgcctgg ttggccaggt 14760 
catctaccac cttttgttgg aaattttgct 14820 
tgctgctgtt ccaggccttg ctgccatagt 14880 
cgctccacag cgagccccag ctgaaggcgc 14940 
gccagtttcc catgaacggg cgcgagccgt 15000 
cttccattct acaaaatagt tacaggacca 15060 
tgatttttcc acatgcaact tgtttttaat 15120 
gcaattccgg gcacggcgcc aatcgccgcg 15180 
ccgccgcgca tgcgacggtg cgacgcgcgc 15240 
cccgtcaggc cgtggccggc catgcccctc 15300 
accgggaaac ggaggcggca ggtgagggcc 15360 
taaacgatgc tggggtggta gcgcgctgtt 15420 
gccaccctgc gcacggaaat cggggccagc 15480 
gttccagtgg tggttcggcg tcgggtagtc 15540 
cgatagccgg gggtaggcgc gatggaagga 15600 
tgtagggcat attcgggcag tagtgcgctg
cgccgggggg ctgaaacgcg aaacatccac 
gacgcggccg cagcggccgc ctgcaccgcg 
gtttgtgtct ccatgccctc tgtggcagtg 
tgaacgtcca cggtctgcac gcccagtccc 
acctcgggct ccagcccagg ctccacggtc 
cgcttgggta ccatcagctg cacggtgggt 
cgcttttctt cggacggtgc aagcgtgggc 
ctaggtgttg ggttgccctc gtccagcggc 
taggcaaact ccccgaggcg ctcgttggcc 
tcatcataca cgcgcttgta ggtgcgggtg 
gtgccgggtc gcaaaacacg tcttacgcgt 
gcggttgcgt gcagcagttc cacctcgtcg 
ttctttttga cccgctttag ctttcggggc 
ccatagatct ccggcgcgat gacctggagc 
gcttcgttgc gcgccgccgc cgctggatac 
cttgcaatct agttgcgcgg ggggcgggtg 
tcgcgcaccc agtacacgtt gcccctgcga 
gctgcggcgg ccgctcgtcg cctggacctg 
cttcgagcgg cccgcatggc cgcccgtcgg
gccgccgcgc gttgggcggc agtgccgggt 
cgccgtctct tcattttagc ataacgccgg 
tccactgtgg acactggtgg cggcgtgggc 
tcaatggcgt catcgacggt ggtgcgccca 
gcgcggtagt gcccgcgcac gcgcactggg 
aacatcttgc ttgggaagcg caggccccag 
gacatgtttg ctcaaaaagt gcggctcgat 
ttgtaaacgt aggggcaggt gcggcgtctg 
ccgatgctgt tgcgcagcgg tagcgtcccg 
acggtggtga tggtgggggc tggcgggcgc 
aacacgtggg tcagagaggt aaactggcgg 
tagaagctct tggagtgcac gggcaacagc 
tggctcgtgg agcggaaggt cacggggtct 
tgctccgagc cgcaggttac gtcaggagtg 
tgagggtcgc cgtagttgta tgcaaggtac 
ttgcttatta ggttgtaact gcgtttcttg 
ttcttctgag gcttctcgac ctcgggttgc 
tcggcctcag cgcgcttctc ctccgcccgt 
tcgttcatgt cctccaccgg ctgcattgcc 
ccgctgccac tgttgttgcc gccgcctgcg 
aagcttgcct ggtaggcgtc cacatccaac 
tcgtaggtga tcctaaagcc ctcctggaag 
ctcaggcggc tgtgggtgaa gtccaccccg 
aaggcttcgt ttgtatatac cccaggcatg 
ctgaagttgc gggtgtcaaa ctttaccccg 
cccactttca agtagtgctc cacgatcgcg 
gagtagttgc cctcgggcag cgtgaactcc 
tccttagtaa gcgagcgcga caccatcacc
tcgttcacat ttggcatgtt ggtatgcagg
cggtcgtcaa gattgatggt ctgtgtgctt
atgaccgtgg ttagaaagtt gctgtggtcg
gacttgttgt ccacaaggta cacacgggtg
cggatgctgt ttctcccccc ggtaggccgc
aggggagcat cgaaggggga acccagcgcc
tcgtaggagg gaggaggacc ttcctcatac
caagaaaacc aacgctcggt gccatggcct
tttttttttt ttttaaaaca ttctccccag 
actccctccc aaatccagga cgctgctgtc 



gcggtgccgt acttcctgga acggcgcggg 15660
gggtccgttt gcacctccgt agaggttttg 15720
gcatctgcca ccgccgaggc aaccggggac 15780
gcaatactag tgctactggt ggtgggtatc 15840 
ggtgccacct gcttgattgg ccgcacgcgg 15900 
attttttcca agacatcttc cagtcgctgg 15960
gccaagtcac cagactcgcg ctttaggccg 16020
agcacctgct gcagtgtcac gggctttagg 16080 
aacgccaaca tgtccttatg ccgctttccg 16140 
tgctcaagca ggtcctcgtc gccgtacacc 16200
gagcgctcac cgggcgtaaa aactacggtg 16260
cgacctttcc actgtacccg ccgcctgggc 16320 
tcaagttcat catcatcatc atctttcttt 16380 
ttgtaatcct gctcttcctt cttcgggggg 16440
atctcttctt tgattttgcg cttggacata 16500 
atacaacagt acgagtctaa gtagtttttt 16560 
cgcacgggca cgcgcaggcc gctaaccgag 16620 
ccctgagtca tagcactaat ggccgcggct 16680 
gggggcacag tgacaatacc cgcggccagc 16740 
ccggtgcgac gtgcgcggtt aagcagggcc 16800
cggcggcggt ggcgacgtgc tacgcgcctc 16860
gctccgcgca ccacggtctg aatggccgcg 16920 
gtgtagttgc gcgcctcctc caccaccgcg 16980 
gtgcggccgc gtttgtgcgc gccccagggc 17040 
tgttggtcgg agcgcttctt tgccccgcca 17100
cctgtgttat tgctgggcga tataaggatg 17160 
aggacgcgcg gcgagactat gcccagggcc 17220
gcgtcagtaa tggtcactcg ctggactcct 17280 
tgatctgtga gagcaggaac gttttcactg 17340 
gccaaaatct ggttctcggg aaagcgattg 17400
atgagctggg agtagacggc ctggtcgttg 17460
tcggcgccca ccaccggaaa gttgctgatc 17520
tgcatcatgt ctggcaacga ccagtagacc 17580 
caaaggaggg tccatgagcg gatcccggtc 17640
cagctgcggt actgggtgaa ggtgctgtca 17700
ctgtcctctg tcaggggttt gatcaccggt 17760
gcagcggggg cggcagcttc tgccgctgcc 17820
gtggcaaagg tgtcgccgcg aatggcatga 17880 
gcggctgccg cgttggagtt ctcttccgcg 17940 
ccatccccgc cctgttcggt gtcatctttt 18000 
agtgcgggaa tgttaccacc ctccaggtca 18060 
ggttgccgct tgcggatgcc caacaagttg 18120
catcctggca gcaaaatgat gtctggatgg 18180 
acaagaccag tgactgggtc aaaccccagt 18240 
atgtcgcttt ccagaacccc gttctgcctg 18300 
ttgttcataa ggtctatggt catggtctcg 18360 
acccactcat atttcagctc cacctgtttg 18420 
cgcgccttaa acttattggt aaacatgaac 18480 
atggttttca ggtcgccgcc ccagtgcgaa 18540 
gcctcccccg ggctgtagtc attgttttga 18600 
ttctggtagt tcagggatgc cacatccgtt 18660
gtgtcgaata ggggtgccaa ctcagagtaa 18720 
aggtaccgcg gaggcacaaa cggcgggtcc 18780
gccgccactg gcgccgcgct caccacgctc 18840 
atcgccgcgc gctgcatact aaggggaata 18900
tggtgagttt tttattttgc atcatgcttt 18960
cctggggcga aggtgcgcaa acgggttgcc 19020
gtctgccgag tcatcgtcct cccacaccag 19080
accccgctga cggtcgtgcc tttgacgacg
gtgctcctgc gcatacgtct tccatctact 
gttgggaaat gccggaggca ggttcttttc 
gtactcctcc tcgcccagca ggcgcgggcg 
caagcttgga aatgggctac tcgcatctga 
gctgcttggc ctgcggaagc tttcctttcg 
caactctagc agggtctgcg gttgcgggga 
gaatccatcg ttaccctcgg gcacctcaaa 
ccagtgcggg ttcaagatgg cattggtgaa 
caagtagtcc attaggcgat tgataaacgg 
gttgcgcgcg gtcatgtcca gcgccacgct 
gctcacgctc tgctgcacat agcgcaagat 
cgaggggatc ttctgccgcc ggttggtcag 
cgtgtcctcc tgccccagcg cgcggctgac 
cacatgcgcc tgacctatgg cctcgcggta 
ccgggacacg ctgccactgt ccgtgaaggg 
gggcgtcagc aagctagaca cggtcgcgcg 
cccctgcaag ttcttgaaag cctggctcag 
ctggaaaaaa tagtctggcc cggactggta
tagccgcagt gcgctcacaa agttggtgta 
ctgtgtactc aggaaggcgt ttagtgcaac 
gcgctcacgc tgcgccacgg cctcgcgcac 
cacgttgccg ctgttgtaac gagccacgcg 
ctcatcgggc cggatggccc tgttttcggc 
gtgcgtgggg tttgcgcgcg ccgggaccac 
ctgcggctgc tgccggaacg cgtcagggtt 
gacctggcgc cagtcgtccg tggagttaag 
caccgccggg tccgttgcgt cttgcatcat 
ccgtcctctg gctcgtactc atcgtcctcg 
gcgcgcgcgg gtgccaccgc cagcccaggt 
cttggggccc agcgcaggtc agcgcccgcg 
ctgcccgtgc cagccagggc cctttgcagg 
cgccgccggc tcacgctcac ggccttgtgg 
tcgctcaagg taagcacctt caacgccatg
ttgtctatgg gaacgtaagg ggtatggtat 
agcatggaat agttaatggc ggccaccttg 
actatgctct gcagaatgtt tatcaaatcg 
tttagcagcg catccctgaa tgcctcgttg 
tgcgccatga gcggcttgct atttgggttt 
tgcatcagtc ctatagccac ctcctcgcgc 
cttttttgaa agttaatctc ctggttcacc 
gccgccacgt gtgcgcgcgc gggactaatc 
tcgcgcagca accgctcgcg gttcaggcca 
cgatcccgca tctcctcggg ctcctctccc 
acgtacgcct cgcgcgtgtc acgcttcagc 
gctcctagcc gcgccaggcc ctcgccctcc 
cgcgggggtt cgtaatcacc atctgccgcc 
gcggtaggag aaggggaggg tgccctgcat 
tgaggagggg ggcgcatctg ccgcagcacc 
tcgtccctgt ttccggagga atttgcaagc 
cgttcgccgc agtccggccg gcccgagact 
gaaaataacc ctccggctac agggagcgag 
cgcttacgcc gcgcgcggcc agtggccaaa 
aggaagccaa aaggagcgct cccccgttgt 
ggcggtaacc gcatggatca cggcggacgg 
gccatgatac ccttgcgaat ttatccacca 
tccttttgca cggtctagag cgtcaacgac 
accatggagc actttttgcc gctgcgcaac 



ggtgggcggg cgcgggccgg gcacatccct 19140 
catcttgtcc actaggctct ctatcccgtt 19200 
gcgctgcggc tgcagcagcg agttgtttag 19260 
ggtggtgcga gtgctggtaa aagaccctat 19320 
ccgcggggcc gcagcgccta gatcggacaa 19380 
cagcgccgcc tctgcctgct cgcgctgttg 19440 
aaacacgctg tcgtctatgt cgtcccagag 19500 
tcccccggtg tagaaaccag ggggcggtag 19560 
atactcgggg ttcacggcgg ccgcgcgatg 19620 
ccggtttgag gcatacatgc ccggttccat 19680 
gggcgttacc ccgtcgcgca tcaggttaag 19740 
gcgctcctcc tcgctgttta aactgtgcaa 19800 
caggtagttc agggttgcct ccaggctgcc 19860 
acttgtaatc tcctggaaag tatgctcgtc 19920 
cagtgtcagc aagtgaccta ggtatgtgtc 19980 
cgctattagc agcagcaaca ggcgcgagtt 20040 
gtcgcctgtg ggagcccgca ccccccacag 20100 
gtttacggtc tgcaggcctt gtctactggt 20160 
cacctcactt tgcggtgtct cagtcaccat 20220 
gtcctcctgt ccccgcggca cgtcggcggg 20280
catggagccc aggttgccct gctgctgcgc 20340 
atcccccacc agccggtcca ggttggtctg 20400 
ctgaagcagc gcgtcgtaga ccaggccggc 20460 
cagcgcgttt acgatcgcca gcaccttctc 20520 
cgcttccaga attgcggaga gccggttggc 20580 
acgcgcagtc agcgacatga tgcggtccat 20640 
gccggacggc tggctctgca gcgccgcccg 20700 
ctgatcagaa acatcaccgc ttagtactcg 20760 
tcatattcct ccacgccgcc gacgttgcca 20820 
ccggccccag ctgcctccag ggcgcgtcgg 20880 
tcaaagtagg actcggcctc tctatcgccg 20940 
ctgtgcatca gctcgcggtc gctgagctcg 21000 
atgcgctcgt tgcgataaac gcccaggtcg 21060 
cgcatgtaga acccctcgat ctttacctcc 21120 
atcttgcggg cgtaaaactt gcccagactg 21180 
tcagccaggc tcaagctgcg ctcctgcacc 21240 
agcagccagc ggccctcggg ctctactatg 21300 
tccctgctgt gctgcactat aaggaacagc 21360 
tgctccagcg cgcttacaaa gtcccacaga 21420
gccacaagcg tgcgcacgtg gttgttaaag 21480 
gtctgctcgt acgcggttac caggtcggcg 21540 
ccggtccgcg cgtcgggctc aaagtcctcc 21600 
tgccgcaact cgcgccctgc gtggaacttt 21660 
tcgcggtcgc gaaacaggtt ctgccgcggc 21720 
tgcacccttg ggtgtcgctc aggagagggc 21780 
tccaagtcca ggtagtgccg ggcccggcgc 21840 
gcgtcagccg cggatgttgc ccctcctgac 21900 
gtctgccgct gctcttgctc ttgccgctgc 21960 
ggatgcatct gggaaaagca aaaaaggggc 22020 
ggggtcttgc atgacgggga ggcaaacccc 22080 
cgaaccgggg gtcctgcgac tcaacccttg 22140 
ccacttaatg ctttcgcttt ccagcctaac 22200 
aaagctagcg cagcagccgc cgcgcctgga 22260 
ctgacgtcgc acacctgggt tcgacacgcg 22320 
ccggatccgg ggttcgaacc ccggtcgtcc 22380 
gaccacggaa gagtgcccgc ttacaggctc 22440 
tgcgcacgcc tcaccggcca gagcgtcccg 22500 
atctggaacc gcgtccgcga ctttccgcgc 22560 
gcctccacca ccgccgccgg catcacctgg
cttatgttgg aagacctcgc ccccggagcc 
cagccgccgc cgcacttttt ggtgggatat 
gtctttgact caagggctta ctcgcgtctc 
cagaccgtta actggtccgt tatggccaac 
caccgctttg tggacatgga tgacttccag 
ttagccgagc gcgttgtcgc cgacctggcc 
acacgcatgg gaggaagagg gcgccaccta 
gatgcaagag atgcaggaca agaggaagga 
caagactact acaaagacct gcgccgatgt 
ctgcgcattc agcaggccgg acccaaggac 
aagaccgcct actttaatta catcatcagc 
cacccgctgc cgcccgccac ggtgctcagc 
tttctcgaga ggttttccga tccggtcgat 
gtacctacac aacaattgtt gagatgcatc 
cccccgccaa cccataaccg ggacatgacg 
aacggccgcg ccgtcaccga gaccatgcgc 
gtcgaccgcc tcccggtgcg ccgtcgtcgc 
gaagaagaag aagaagggga ggcccttatg 
gfcsgcetfctg agcgcgaggfc -gcgcgacacfc 
gagttaaccg tgtcggcgcg caactcccag 
gccatggagc gccttgaggc cttgggggat 
atgtacttct tcgtggcaga acacaccgcc 
cgaaactacg ccgtcttcgc ccggcacgtg 
gcccgcgatg ccgaaggggg cgtggtctac 
gccttctcgc agctcatggc ccgcatctcc 
ggacgcggag atctccagga ggaagagatc 
gacaactcag gagacgtgca ggagattttg 
gattctgtcg aactctcttt caggttcaag 
cgccagattc aggagatcaa ccgccgcgtc 
caccagctcc tgcccgcgcg cggcgccgac 
gagccccccc tacctccggg ggcccgcccg 
acacccccgc ggcccaccgc ccgccgcgcg 
tcttgcaagt catcgacgcc gccaccaacc 
tagcccgcgc cctgacccgg ctatgcgagg 
cgccgcggga gctccagacc atggacagct 
gaccgccgcg cgcggacatc tggactttgg 
taactcccct cgagcagcca gacggtcaag 
caaacccgcc aggcgagggg ctcaaattcc 
tcaacctcgt gcaggatgta cagcccgtgc 
aaagccagca cgagtgttcg gcccgtcgca 
actcctccaa ctggtggcgg gagatccagt 
agcgtctctt tgtcacctac gatgtagaga 
agctcgtgcc cttcatgctg gttatgaagt 
cgcgagacct agccgtggac cttggatggg 
actgcatcac cccagaaaaa atggccatag 
tgcaaatgct aatggcccgt gacctgtgga 
cagactgggc cctgtcagaa cacgggctca 
ttaaaaaatt gccctccatc aagggcaccc 
acaacatcaa cggcttcgac gagatcgtgc 
aggtgccggg acccttccgc atcacacgca 
tcaacgatgt caccttcgcc ctgccaaacc 
tctgggagca gggcggatgc gacgacactg 
ttagggacac ctttgcgctc acccacacct 
tacccgtaga aaagggatgc tgcgcctacc 
cttaccgttc ggaggccgac gggtttccga 
ttgtcctcaa ccgcgagctg tggaaaaaaa 
aaaccctgga ctactgcgcc ctagacgtgc 



atgtccaggt acatctacgg atatcatcgc 22620 
ccggccaccc tacgctggcc cctctaccgc 22680 
cagtacctgg tgcggacttg caacgactac 22740 
aggtacaccg agctctcgca gccgggtcac 22800 
tgcacttaca ccatcaacac gggcgcatac 22860 
tctaccctca cgcaggtgca gcaggccata 22920 
ctgcttcagc cgatgagggg cttcggggtc 22980 
cggccaaact ccgccgccgc cgtagcgata 23040 
gaagaagaag tgccggtaga aaggctcatg 23100 
caaaacgaag cctggggcat ggccgaccgc 23160 
atggtgcttc tgtcgaccat ccgccgtctc 23220 
agcacctccg ccagaaacaa ccccgaccgc 23280 
ctaccttgcg actgtgactg gttagacgcc 23340 
gcggactcgc tcaggtccct cggtggcgga 23400 
gttagcgccg tatccctgcc gcacggcagc 23460 
ggcggcgtct tccaactgcg cccccgcgag 23520 
cgtcgccgcg gggagatgat cgagcgcttt 23580 
cgccgtgtcc cccctccccc accgccgcca 23640 
gaagaggaga ttgaagaaga agaggcccct 23700 
ytcgccgagc tcatccgtct tctggaggag 23760
tttttcaact tcgccgtgga cttctacgag 23820 
atcaacgaat ccacgttgcg acgctgggtt 23880 
accaccctca actacctctt tcagcgcctg 23940 
gagctcaatc tcgcgcaggt ggtcatgcgc 24000
agccgcgtct ggaacgaggg aggcctcaac 24060 
aacgacctcg ccgccaccgt ggagcgagcc 24120 
gagcagttca tggccgaaat cgcctatcaa 24180
cgccaggccg ccgtcaacga caccgaaatt 24240 
ctcaccgggc ccgtcgtctt cacgcagagg 24300 
gtcgcgttcg ccagcaacct ccgcgcgcag 24360 
gtgcccctgc cccctctccc ggcgggtccc 24420 
cgtcaccgct tttagatgca tcatccaagg 24480 
gtaccgtagt cgcgccgcgg ggatgcggcc 24540 
agcccctgga aatcaggtat cacctggacc 24600 
taaacctgca ggagctcccg cctgacctgt 24660 
cccatctgcg cgatgttgtc atcaagctcc 24720 
gctcgcgcgg cgtggtggtc cgatccacca 24780
gacaagcagc cgaagtagaa gaccaccagc 24840 
cactctgctt ccttgtgcgc ggtcgtcagg 24900 
accgctgcca gtactgcgca cgtttttaca 24960 
gggacttcta ctttcaccac atcaacagcc 25020 
tcttcccgat cggctcgcat cctcgcaccg 25080 
cctatacttg gatgggggcc tttgggaagc 25140 
tcggcggaga tgagcctctg gtgaccgccg 25200 
accgctggga acaagacccg cttaccttct 25260 
gtcgccagtt taggaccttt cgcgaccacc 25320 
gctcattcgt cgcttccaac cctcatcttg 25380 
gctcccctga ggagctcacc tacgaggaac 25440 
cgcgcttctt ggaactttac atcgtgggcc 25500 
tcgccgccca ggtaattaac aaccgttccg 25560 
actttatgcc tcgcgcggga aagatacttt 25620 
cgcgttccaa aaagcgcacg gactttttgc 25680 
acttcaaata ccagtacctc aaagtcatgg 25740 
cgctccggaa ggccgcgcag gcatacgcgc 25800 
aggccgtcaa ccagttctac atgctaggct 25860 
tccaagagta ctggaaagac cgcgaagagt 25920 
agggacagga taagtatgac atcatcaagg 25980 
aggtcaccgc cgagctggtc aacaagctgc 26040 
gcgactccta cgcctccttc gtgcgtgacg cggtaggtct cacagacgcc agcttcaacg 26100
tcttccagcg tccaaccata tcatccaact cacatgccat cttcaggcag atagtcttcc 26160 
gagcagagca gcccgcccgt agcaacctcg gtcccgacct cctcgctccc tcgcacgaac 26220 
tatacgatta cgtgcgcgcc agcatccgcg gtggaagatg ctaccctaca tatcttggaa 26280 
tactcagaga gcccctctac gtttacgaca tttgcggcat gtacgcctcc gcgctcaccc 26340 
accccatgcc atggggtccc ccactcaacc catacgagcg cgcgcttgcc gcccgcgcat 26400 
ggcagcaggc gctagacttg caaggatgca agatagacta cttcgacgcg cgcctgctgc 26460 
ccggggtctt taccgtggac gcagaccccc cggacgagac gcagctagac ccactaccgc 26520 
cattctgttc gcgcaagggc ggccgcctct gctggaccaa cgagcgccta cgcggagagg 26580 
tagccaccag cgttgacctt gtcaccctgc acaaccgcgg ttggcgcgtg cacctggtgc 26640 
ccgacgagcg caccaccgtc tttcccgaat ggcggtgcgt tgcgcgcgaa tacgtgcagc 26700 
taaacatcgc ggccaaggag cgcgccgatc gcgacaaaaa ccaaaccctg cgctccatcg 26760 
ccaagttgct gtccaacgcc ctctacgggt cgtttgccac caagcttgac aacaaaaaga 26820 
ttgtcttttc tgaccagatg gacgcggcca ccctcaaagg catcaccgcg ggccaggtga 26880 
atatcaaatc ctcctcgttt ttggaaactg acaatcttag cgcagaagtc atgcccgctt 26940 
ttgagaggga gtactcaccc caacagctgg ccctcgcaga cagcgatgcg gaagagagtg 27000 
aggacgaacg cgcccccacc cccttttata gccccccttc aggaacaccc ggtcacgtgg 27060 
cctacaccta taaaccaatc accttccttg atgccgaaga gggcgacatg tgtcttcaca 27120 
ccctggagcg agtggacccc ctagtggaca acgaccgcta cccctcccac ttagcctcct 27180
tcgtgctggc ctggacgcga gccttcgtct cagagtggtc cgagtttcta tacgaggagg 27240
accgcggaac accgctcgag gacaggcctc tcaagtctgt atacggggac acggacagcc 27300 
ttttcgtcac cgagcgtgga caccggctca tggaaaccag aggtaagaaa cgcatcaaaa 27360 
agcatggggg aaacctggtt tttgaccccg aacggccaga gctcacctgg ctcgtggaat 27420 
gcgagaccgt ctgcggggcc tgcggcgcgg atgcctactc cccggaatcg gtatttctcg 27480 
cgcccaagct ctacgccctt aaaagtctgc actgcccctc gtgcggcgcc tcctccaagg 27540 
gcaagctgcg cgccaagggc cacgccgcgg aggggctgga ctatgacacc atggtcaaat 27600 
gctacctggc cgacgcgcag ggcgaagacc ggcagcgctt cagcaccagc aggaccagcc 27660 
tcaagcgcac cctggccagc gcgcagcccg gagcgcaccc cttcaccgtg acccagacta 27720 
cgctgacgag gaccctgcgc ccgtggaaag acatgaccct ggcccgtctg gacgagcacc 27780 
gactactgcc gtacagcgaa agccgcccca acccgcgaaa cgaggagata tgctggatcg 27840 
agatgccgta gagcacgtga ccgagctgtg ggaccgcctg gaactgcttg gtcaaacgct 27900 
caaaagcatg cctacggcgg acggcctcaa accgttgaaa aactttgctt ccttgcaaga 27960 
actgctatcg ctgggcggcg agcgccttct ggcgcatttg gtcagggaaa acatgcaagt 28020 
cagggacatg cttaacgaag tggcccccct gctcagggat gacggcagct gcagctctct 28080 
taactaccag ttgcagccgg taataggtgt gatttacggg cccaccggct gcggtaagtc 28140 
gcagctgctc aggaacctgc tttcttccca gctgatctcc cctaccccgg aaacggtttt 28200
cttcatcgcc ccgcaggtag acatgatccc cccatctgaa ctcaaagcgt gggaaatgca 28260 
aatctgtgag ggtaactacg cccctgggcc ggatggaacc attataccgc agtctggcac 28320 
cctccgcccg cgctttgtaa aaatggccta tgacgatctc atcctggaac acaactatga 28380 
cgttagtgat cccagaaata tcttcgccca ggccgccgcc cgtgggccca ttgccatcat 28440 
tatggacgaa tgcatggaaa atctcggagg tcacaagggc gtctccaagt tcttccacgc 28500 
atttccttct aagctacatg acaaatttcc caagtgcacc ggatacactg tgctggtggt 28560 
tctgcacaac atgaatcccc ggagggatat ggctgggaac atagccaacc taaaaataca 28620 
gtccaagatg catctcatat ccccacgtat gcacccatcc cagcttaacc gctttgtaaa 28680 
cacttacacc aagggcctgc ccctggcaat cagcttgcta ctgaaagaca tttttaggca 28740 
ccacgcccag cgctcctgct acgactggat catctacaac accaccccgc agcatgaagc 28800 
tctgcagtgg tgctacctcc accccagaga cgggcttatg cccatgtatc tgaacatcca 28860 
gagtcacctt taccacgtcc tggaaaaaat acacaggacc ctcaacgacc gagaccgctg 28920 
gtcccgggcc taccgcgcgc gcaaaacccc taaataaaga cagcaagaca cttgcttgat 28980 
caaaatccaa acagagtctg gtttttattt atgttttaaa ccgcattggg aggggaggaa 29040 
gccttcaggg cagaaacctg ctggcgcaga tccaacagct gctgagaaac gacattaagt 29100 
tcccgggtca aagaatccaa ttgtgccaaa agagccgtca acttgtcatc gcgggcggat 29160 
gaacgggaag ctgcactgct tgcaagcggg ctcaggaaag caaagtcagt cacaatcccg 29220 
cgggcggtgg ctgcagcggc tgaagcggcg gcggaggctg cagtctccaa cggcgttcca 29280 
gacacggtct cgtaggtcaa ggtagtagag tttgcgggca ggacggggcg accatcaatg 29340 
ctggagccca tcacattctg acgcaccccg gcccatgggg gcatgcgcgt tgtcaaatat 29400 
gagctcacaa tgcttccatc aaacgagttg gcgctcatgg cggcggctgc tgcaaaacag 29460 
atacaaaact acatgagacc cccaccttat atattctttc ccacccttaa gccccgccca 29520 
tcgatggcaa acagctatta tgggtattat
tcagtgtcgc tgatttgtat tgtctgaagt 
taatacgata cctgcgtcat aattgattat 
tgtgatatgt agatgataat cattatcact 
tacggggcgg cgacctcgcg ggttttcgct 
tccgttcttc ttcgtcataa cttaatgttt 
aacgacaggt gctgaaagcg aggctttttg 
ccgtggaatg aacaatggaa gttaacggat 
ccaggaaccg taaaaaggcc gcgttgctgg 
agcatcacaa aaatcaacgc tcaagtcaga 
accaggcgtt tccccctgga agctccctcg 
ccggatacct gtccgccttt ctcccttcgg 
gtaggtatct cagttcggtg taggtcgttc 
ccgttcagcc cgaccgctgc gccttatccg 
gacacgactt atcgccactg gcagcagcca 
taggcggtgc tacagagttc ttgaagtggt 
tatttggtat ctgcgctctg ccaaagccag 
gatccggcaa acaaaccacc gctggtagcg 
cgcgcagaaa aaaaggatct caagaagatc 
agtggaacga aaactcacgt taagggattt
cctagatcct tttaaattaa aaatgaagtt 
cttggtctga cagttaccaa tgcttaatca 
ttcgttcatc catagttgcc tgactccccg 
taccatccgg ccccagtgct gcaatgatac 
tatcagcaat aaaccagcca gccggaagtg 
ccgcctccat ccagtctatt agttgttgcc 
atagttttcg caacgttgtt gccattgcta 
gtatggcttc attcagctcc ggttcccaac 
tgtgcaaaaa agcggttagc tccttcggtc 
cagtgttatc actcatggtt atggcagcac 
taagatgctt ttctgtgact ggtgagtatt 
catagcagaa ctttaaaagt gctcatcatt 
aggatcttac cgctgttgag atccagttcg 
tctgcatctt ttactttcac cagcgtttct 
gcaaaaaagg gaataagggc gacacggaaa 
tattattgaa gcatttatca gggttattgt 



gggtgctagc gacatgaggt tgccccgtat 29580 
tgtttttacg ttaagttgat gcagatcaat 29640 
ttgacgtggt ttgatggcct ccacgcacgt 29700 
ttacgggtcc tttccggtga tccgacaggt 29760 
atttatgaaa attttccggt ttaaggcgtt 29820 
ttatttaaaa taccctctga aaagaaagga 29880 
gcctctgtcg tttcctttct ctgtttttgt 29940 
ccaggccgcg agcaaaaggc cagcaaaagg 30000 
cgtttttcca taggctccgc ccccctgacg 30060 
ggtggcgaaa cccgacagga ctataaagat 30120 
tgcgctctcc tgttccgacc ctgccgctta 30180 
gaagcgtggc gctttctcat agctcacgct 30240 
gctccaagct gggctgtgtg cacgaacccc 30300 
gtaactatcg tcttgagtcc aacccggtaa 30360 
ctggtaacag gattagcaga gcgaggtatg 30420 
ggcctaacta cggctacact agaagaacag 30480 
ttaccttcgg aaaaagagtt ggtagctctt 30540 
gtggtttttt tgtttgcaag cagcagatta 30600 
ctttgatctt ttctacgggg tctgacgctc 30660 
fcggfceaccag attatcaaaa aggatcfctca 30720 
ttaaatcaat ctaaagtata tatgagtaaa 30780 
gtgaggcacc tatctcagcg atctgtctat 30840
tagtgtagat aactacgata cgggagggct 30900 
cgcgtgaccc acgctcaccg gctcctgatt 30960 
ccgagcgcag aagtggtcct gcaactttat 31020 
gggaagctag agtaagtagt tcgccagtta 31080 
caggcatcgt ggtgtcacgc tcgtcgtttg 31140 
gatcaaggcg agttacatga tcccccatgt 31200 
ctccgatagt tgtcagaagt aagttggccg 31260 
tgcataattc tcttactgtc atgccatccg 31320 
caaccaagaa tacgggataa taccgcgcca 31380 
gggaaacgtt cttcggggcg aaaactctca 31440 
atgtaaccca ctcgcgcacc caagtgatct 31500 
gggtgagcaa aaacaggaag gcaaaatgcc 31560 
tgttgaatac tcatactttt cctttttcaa 31620 
ctcatcagcg gatacatatt tg 31672 



<210> 4 
<211> 30365 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> derived from Adenovirus 
<400> 4 

gaatcggcca gcgcgaatta actataacgg 
cttattttgg attgaagcca atatgataat 
gcgtgggaac ggggcgggtg acgtaggttt 
ttgtagtttt tttaaaatgg gaagttacgt 
gaagttgtgg gttttttggc tttcgtttct 
gttttttgtg gactttaacc gttacgtcat 
ttggcccttt ttacactgtg actgattgag 
aggttttttt actggtaagg ctgactgtta 
ttctggagcg ggagggtgct attttgccta 
agttatggcg cgccagatct gtttgtcacg 



tcctaaggta gcgtcatcat cataatatac 60 
gagggggtgg agtttgtgac gtggcgcggg 120 
tagggcggag taacttgcat gtattgggaa 180 
atcgtgggaa aacggaagtg aagatttgag 240 
gggcgtaggt tcgcgtgcgg ttttctgggt 300 
tttttagtcc tatatatact cgctctgtac 360 
ctggtgccgt gtcgagtggt gttttttaat 420 
tggctgccgc tgtggaagcg ctgtatgttg 480 
ggataacttc gtataatgta tgctatacga 540 
cccgcacctg gttttgcttc aggaaatatg 600 
actacgtccg gcgttccatt tggcatgaca
cgcactccgt acagtaggga tcgcctacct 
ctggaggatc atccgctgct gcccgaatgt 
gtgcgaggtc ttccctgcag tgtgggattt 
gatatggttc tgacgcggga ggagcttgta 
tgttgtgcca acattgatat catgacgagc 
ctccactgtc attgttccag tcccggttcc 
gccagctggt ttaggatggt ggtggatggc 
cgggaggtgg tgaattacaa catgccaaaa 
aggggtcgcc acttaatcta cctgcgcttg 
cccgccatga gctttggata cagcgccttg 
ctgtgctgca gttactgtgc tgatttaagt 
acaaggcgtc tcatgctgcg ggcggtgcga 
tattcctgca ggacggagcg gcggcggcag 
cgccctatcc tgatgcacga ttatgactct 
ccgcccgttg agcaaccgca agttggacag 
atgaacttaa gcgagctgcc cggggagttt 
caggaaaccg tgtggaatat aacacctaag 
tttaaggcca gccggggaga aaggactgtg 
ttgaatacta gggttctgtg agtttgatta
ggtggggcta tactactgaa tgaaaaatga 
acgttgaaac ataacatgca acaggttcac 
ggtgtaagag ttggtagcaa aagtttcagt 
aaaagacata gagtaagtgc ttacctcgct 
tgtaatggta agtatcatag gtttagtttt 
gttattttta gcagtttgac tttgggtttt 
aactgcattt gtgtatggat ttgcattagt 
ttttttaagt gaggagttct ccattagaac 
aacacttgca acggtgcctg tcatggatga 
agtagctagt acttgactcc cacattttgt 
tgaatgaatt ctgcagttag gagatgggtc 
atttttgttt cctattgtaa tggcccctga 
agtaatcatg gcaccgtttt cattgtaatc 
gttgatatct ggagactcag atgtgtttgt 
agctatggca gtattatcaa agtttagtcc 
tttagtattg tttgatgcat taaaaaggta 
atgagatgca ttaatataca ggggtccctg 
atcaaatggg taatccacat ctagaattaa 
cgttttaatt tccatgttgt ttgatgaatc 
aagggagttt tgttcaacgg tgacacctgg 
ttgtgctact tgcaaaggac cgcttatttt 
aggatcttcc atgttaatgc ccaagctacc 
tacagtaagg gtgtcgctgt cactgccaga 
tccatctgac actgtaatgg gccctttagt 
cagtggggct tgtgactgta cgctaagagc 
tgccactgtt agggcgcctg aggtaattgt 
tgactttgtt tttttaagtg gctgagtaac 
ggccttgtct agggtaagac cgctgcccat 
caaaggttcg gagacgcgta gagagagaac 
tgaaacaaat ggaggggtaa gaaagggcac 
atacacgggg ttgaaggtgt cttcagacgg 
gatagtgggt gcggagggac aagaacatga 
aaagtttgca gctaaaaggc ggctgagata 
gctgaataag ctggacaaag atttgctgac 
agccgtccgc cgagtctttc accgcgtcaa 
cgggaggtcc agaaaagggg ttgaagtaaa 
gagttccaat gcctccggag cgcggctccg 
gacggggcgt aaatgaagag cggccagcgc 



ctacgaccaa cacgatctcg gttgtctcgg 660 
ccttttgaga cagagacccg cgctaccata 720 
aacactttga caatgcacaa cgtgagttac 780
acgctgattc aggaatgggt tgttccctgg 840 
atcctgagga agtgtatgca cgtgtgcctg 900 
atgatgatcc atggttacga gtcctgggct 960 
ctgcagtgca tagccggcgg gcaggttttg 1020 
gccatgttta atcagaggtt tatatggtac 1080 
gaggtaatgt ttatgtccag cgtgtttatg 1140 
tggtatgatg gccacgtggg ttctgtggtc 1200 
cactgtggga ttttgaacaa tattgtggtg 1260 
gagatcaggg tgcgctgctg tgcccggagg 1320
atcatcgctg aggagaccac tgccatgttg 1380 
cagtttattc gcgcgctgct gcagcaccac 1440 
acccccatgt aggcgtggac ttccccttcg 1500 
cagcctgtgg ctcagcagct ggacagcgac 1560 
attaatatca ctgatgagcg tttggctcga 1620 
aatatgtctg ttacccatga tatgatgctt 1680 
tactctgtgt gttgggaggg aggtggcagg 1740 
aggtacggtg atcaatataa gctatgtggt 1800
cttgaaattt tctgcaattg aaaaataaac 1860 
gattctttat tcctgggcaa tgtaggagaa 1920 
ggtgtatttt ccactttccc aggaccatgt 1980 
agtttctgtg gattcactag tgccattaag 2040 
atcaccatgc aagtaaactt gactgacaat 2100 
tggataggct agaaggttag gcataaatcc 2160 
tgagttccca tttctaaagt tccagtaatg 2220 
accgttttgg tcaaatctaa ggaatatact 2280 
aagatctcca gatacagcca aagcagctac 2340 
aagaaccaaa gtaaatttgc agtcattatc 2400 
tggggttgtc cacagggtaa gtttgtcatc 2460 
gttgtcaaag cttaaacccg ctccaagttt 2520 
aatgccagag ccaattttag tttttattgg 2580 
atcaaactcc agaccctttc ctgcatttat 2640 
actggatttt tttatgctaa cttccagttt 2700 
taggcctctg ttatagttta tgtccaagtt 2760 
ccccagttta agacgtagtt ttgtttgagc 2820 
caagttgtta tttatacgca tgccaccgcc 2880 
ataaccaata gctcctgcaa ctttggttct 2940 
tccagtaact actgttagtg tatcggagtt 3000 
aattcctatt tttccattat ttacataaat 3060 
cgtggcagta gttagcgggg gtgatgcagt 3120 
gaggggggct gatgtttgca gggctagctt 3180 
agcaatgctt agtttggagt cttgcacggt 3240 
gccgctagta actatcagag gagcggtggt 3300 
aagtggtgcg gaggtgtcca aacttatgtt 3360 
agtggttaca ttttgggagg tgaggtttcc 3420 
tttaagcgca agcatgccgt gggaggtgtc 3480 
tccaggggga ctttcttgga aaccattggg 3540 
agttggaggc ccggtttctg tgtcatatgg 3600 
tctggcgcgt ttcatctgca acaatatgaa 3660 
ggaatttgac atcccattta aactttggag 3720 
ccagagttgg gaggaaggaa aggaggtgat 3780 
tgattttaag taagtaattt attcagtcgt 3840 
agttgggaat aaactggtcc gggtagtggc 3900 
ccgaaggcac gaactcctca ataaattgta 3960 
aggacgaggt ctgcagagtt aggatcgcct 4020 
cgccgatctg aaatgtcccg tccggacgga 4080 
gaccaagaga ggagctcacc gactcgtcgt
ggtgagttat accctgcccg ggcgaccgca 
cccctgagtt agtcatctga acttcggcct 
gagcgggact ttcctggtac accagggcag 
gaggtgtggt ggtaatagcc gcctgttcga 
cgttgacccg ggatatcatg tggggtcccg 
cttgggcagc tccagccgca agtcccattt 
gaatttcctt gctcataatg gcgctgacga 
gatgacgtag ttttcgcgct taaatttgag 
gcgcgcagta tttgctgaag agagcctccg 
cgcttttgtg atacaggcag ctgcgggtga 
gctcttgttc ttggcccctg ctttgttgaa 
ttctaagctc gcgggtcgat acgggttcgt 
ctgctgccgc cgctgtggat ttcttgggct 
gcttctgtgt gaccgctgct gttgctgccg 
tagagatgac ggtagtaatg caggatgtta 
agaaagcggc gggcgaagga gatgttgccc 
ttcttgtgcc cgcgccacga gcggtagcct 
ggcggctgct tagacttacc ggccctggtt 
cgaacaggca gtgccggcgg cycctgagga 
gccaatttct ggggcgccgg cgaggggaat 
acctcttcgg cctcggaagc ttcgtctagg 
tcgtccaaaa cctcctctgc ctgactgtcc 
ggcggcagct gcagcttctt tttgggtgcc
atagggctgc ggcggcgggg ggattgggtt 
aaccccccgt ccctttcgta gcagaaactc 
gccaaggatg tggccctggg taatgacgca 
ttggtcttcg tagaacctaa tctcgtgggc 
gtaagccgac gtccacagcc ccggagtgag 
aggcgaggga ccctgcagct caaaggtacc 
attgcagacc agggagcggt gcggggtgca 
gccgtcaccg ctcacgtctt ccatgatgtc 
cagaaggtag cagtgacccc aaagcggcgg 
gtcgctagga agcgcacagc aggtggcggg 
cctaaagttt tgcaacatgc tttgactggt 
aagcaggcgt tcggggaaga taatgtccgc 
ggccgtccat aggtccttca agttttgctt 
ctcctccagg cattgctgcc acacgcccat 
gtaaacgcag tcgcggacgt agtcgcggcg 
gttttgcccg aggcggtttt cgtgcaaaat 
cacgttggaa attttgcagg cctggcgcac 
ttcctctagc ttgcgctgca tctccgggtc 
cacggtaaca agcactgcgg ccatcattag 
gcgcgtctca agccagcgcg ccagctgctc 
ttgttcttgc aagtttgcat ccctctccag 
catgactgtg ctcataacct tggggggtag 
ctcgatgctg cgtttcagca cggctaggcg 
tccacagtga ctttcatttt cgctgttttc 
gtcgcgtcca agaccctcaa agatttttgg 
gacagcgccc tgccgcaagg ccagctgctt 
taggggtatc ttgcagtttt ggaaaaagat 
aaatacgggg tagaagttga ggcgcgggtt 
ggggggtacg cgcggtgaga acaggtggcg 
gaggggcaca tcgctgcgct cttgcaacgc 
atgcttcaac agcacgtcgt ctcccacatc 
cccgacttgt tcctcgtttg cctctgcgtc 
tactgagcga tcctcgtcgt cttcgcttac 
ctcctcctca agcgggggtg cctcgacggg 



tgagctgaat acctcgccct ctgattttca 4140 
ccctgtgacg aaagccgccc gcaagctgcg 4200 
gggcgtctct gggaagtacc acagtggtgg 4260 
cgggccaact acggggatta aggttattac 4320 
ggagaattcg gtttcggtgg gcgcggattc 4380 
cgctcatgta gtttattcgg gttgagtagt 4440 
gtggctggta actccacatg tagggcgtgg 4500 
caggtgctgg cgccgggtgt ggccgctgga 4560 
aaagggcgcg aaactagtcc ttaagagtca 4620 
cgtcttccag cgtgcgccga agctgatctt 4680 
gggagcgcag agacctgttt tttattttca 4740 
atatagcata cagagtggga aaaatcctat 4800 
tgggcgccag acgcagcgct cctcctcctg 4860 
ttgtcagagt cttgctatcc ggtcgccttt 4920 
ctgccgctgc cgccggtgca gtaggggctg 4980 
cgggggaagg ccacgccgtg atggtagaga 5040 
ccacagtctt gcaagcaagc aactatggcg 5100 
tggcgctgtt gttgctcttg ggctaacggc 5160 
ccagtggtgt cccatctacg gttgggtcgg 5220 
gCggaggttg tagcgatgct gggaacggtt "5280 
gcgaccgagg gtgacggtgt ttcgtctgac 5340 
ctgtcccagt cttccatcat ctcctcctcc 5400 
cagtattcct cctcgtccgt gggtggcggc 5460 
atcctgggaa gcaagggccc gcggctgctg 5520 
gagctcctcg ccggactggg ggtccaggta 5580 
ttggcgggct ttgttgatgg cttgcaattg 5640 
ggcggtaagc tccgcatttg gcgggcggga 5700 
gtggtagtcc tcaggtacaa atttgcgaag 5760 
tttcaacccc ggagccgcgg acttttcgtc 5820 
gataatttga ctttcgctaa gcagttgcga 5880 
taggttgcag cgacagtgac actccagtag 5940 
ggagtggtag gcaaggtagt tggctagctg 6000 
agggcattca cggtacttaa tgggcacaaa 6060 
cagaattcct gaacgctcta ggataaagtt 6120 
gaagtctggc agaccctgtt gcagggtttt 6180 
caggtgcgcg gccacggagc gctcgttgaa 6240 
tagcagcttc tgcagctcct ttaggttgcg 6300 
ggccgtttgc caggtgtagc acagaaataa 6360 
cgcctcgccc ttgagcgtgg aatgaagcac 6420 
tccaaggtag gagaccaggt tgcagagctc 6480 
gtagccctgg cgaaaggtgt agtgcaacgt 6540 
agcaaagaac cgctgcatgc actcaagctc 6600 
cttgcgtcgc tcctccaagt cggcaggctc 6660 
atcgccaact gcgggtaggc cctcctcggt 6720 
gggtcgtgca cggcgcacga tcagctcgct 6780 
gttaagtgcc gggtaggcaa agtgggtgac 6840 
cgcgttgtca ccctcaagtt ccaccagcac 6900 
ttgttgcaga gcgtttgccg cgcgtttctc 6960 
cacttcgtcg agcgaggcga tatcaggtat 7020 
gtccgctcgg ctgcggttgg cacggcagga 7080 
gtgataggtg gcaagcacct ctggcacggc 7140 
gggctcgcat gtgccgtttt cttggcgttt 7200 
ttcgtaggca aggctgacat ccgctatggc 7260 
gtcgcagata atggcgcact ggcgctgcag 7320 
taggtagtcg ccatgccttt ggtccccccg 7380 
gtcctggtct tgctttttat cctctgttgg 7440 
aaaacctggg tcctgctcga taatcacttc 7500 
gaaggtggta ggcgcgttgg cggcatcggt 7560 
ggaggcggtg gtggcgaact caaagggggc ggttaggctg tcctccttct cgactgactc 7620
catgatcttt ttctgcctat aggagaagga aatggccagt cgggaagagg agcagcgcga 7680
aaccaccccc gagcgcggac gcggtgcggc gcgacgtcca ccaaccatgg aggacgtgtc 7740
gtccccgtcg ccgtcgccgc cgcctccccg cgcgccccca aaaaagcggc tgaggcggcg 7800
tctcgagtcc gaggacgaag aagactcgtc acaagatgcg ctggtgccgc gcacacccag 7860
cccgcggcca tcgacctcga cggcggattt ggccattgcg tccaaaaaga aaaagaagcg 7920
cccctctccc aagcccgagc gcccgccatc cccagaggtg atcgtggaca gcgaggaaga 7980
aagagaagat gtggcgctac aaatggtggg tttcagcaac ccaccggtgc taatcaagca 8040
cggcaaggga ggtaagcgca cggtgcggcg gctgaatgaa gacgacccag tggcgcgggg 8100
tatgcggacg caagaggaaa aggaagagtc cagtgaagcg gaaagtgaaa gcacggtgat 8160
aaacccgctg agcctgccga tcgtgtctgc gtgggagaag ggcatggagg ctgcgcgcgc 8220
gttgatggac aagtaccacg tggataacga tctaaaggca aacttcaagc tactgcctga 8280
ccaagtggaa gctctggcgg ccgtatgcaa gacctggcta aacgaggagc accgcgggtt 8340
gcagctgacc ttcaccagca acaagacctt tgtgacgatg atggggcgat tcctgcaggc 8400
gtacctgcag tcgtttgcag aggtaaccta caagcaccac gagcccacgg gctgcgcgtt 8460
gtggctgcac cgctgcgctg agatcgaagg cgagcttaag tgtctacacg ggagcattat 8520
gataaataag gagcacgtga ttgaaatgga tgtgacgagc gaaaacgggc agcgcgcgct 8580
gaaggagcag tctagcaagg ccaagatcgt gaagaaccgg tggggccgaa atgtggtgca 8640
gatctccaac accgacgcaa ggtgctgcgt gcatgacgcg gcctgtccgg ccaatcagtt 8700
-fe-fcccggcasg fccfefcgcggca ~fcgtfccfctcfcc tgaaggcgca -aaggctcagg fcggctirtbaa 8760
gcagatcaag gctttcatgc aggcgctgfca tcctaacgcc cagaccgggc acggtcacct 8820
tctgatgcca ctacggtgcg agtgcaactc aaagcctggg catgcaccct ttttgggaag 8880
gcagctacca aagttgactc cgttcgccct gagcaacgcg gaggacctgg acgcggatct 8940
gatctccgac aagagcgtgc tggccagcgt gcaccacccg gcgctgatag tgttccagtg 9000
ctgcaaccct gtgtatcgca actcgcgcgc gcagggcgga ggccccaact gcgacttcaa 9060
gatatcggcg cccgacctgc taaacgcgtt ggtgatggtg cgcagcctgt ggagtgaaaa 9120
cttcaccgag ctgccgcgga tggttgtgcc tgagtttaag tggagcacta aacaccagta 9180
tcgcaacgtg tccctgccag tggcgcatag cgatgcgcgg cagaacccct ttgattttta 9240
aacggcgcag acggcaaggg tggggggtaa ataatcaccc gagagtgtac aaataaaaac 9300
atttgccttt attgaaagtg tctcctagta cattattttt acatgttttt caagtgacaa 9360
aaagaagtgg cgctcctaat ctgcgcactg tggctgcgga agtagggcga gtggcgctcc 9420
aggaagctgt agagctgttc ctggttgcga cgcagggtgg gctgtacctg gggactgtta 9480
agcatggagt tgggtacccc ggtaataagg ttcatggtgg ggttgtgatc catgggagtt 9540
tggggccagt tggcaaaggc gtggagaaac atgcagcaga atagtccaca ggcggccgag 9600
ttgggcccct gcacgctttg ggtggacttt tccagcgtta tacagcggtc gggggaagaa 9660
gcaatggcgc tacggcgcag gagtgactcg tactcaaact ggtaaacctg cttgagtcgt 9720
tggtcagaaa agccaaaggg ctcaaagagg tagcatgttt ttgagcgcgg gttccaggca 9780
aaggccatcc agtgtacgcc cccagtctcg cgaccggccg tattgactat ggcgcaggcg 9840
agcttgtgtg gagaaacaaa gcctggaaag cgcttgtcat aggtgcccaa aaaatatggc 9900
ccacaaccaa gatctttgac aatggctttc agttcctgct cactggagcc catggcggca 9960
gctgttgttg atgttgcttg cttcttttat gttgtggcgt tgccggccga gaagggcgtg 10020
cgcaggtaca cggtctcgat gacgccgcgg tgcggctggt gcacacggac cacgtcaaag 10080
acttcaaaca aaacataaag aagggtgggc tcgtccatgg gatccacctc aaaagtcatg 10140
tctagcgcgt gggcggagtt ggcgtagaga aggttttggc ccaggtctgt gagtgcgccc 10200
atggacataa agttactgga gaatgggatg cgccaaaggg tgcgatcgca aagaaacttt 10260
ttctgggtaa tactgtcaac cgcggttttg cctattagtg ggtagggcac gttggcgggg 10320
taagcctgtc cctcgcgcat ggtgggagcg aggtagccta cgaatcctga gttgttatgc 10380
tggtgaagaa ttccaacctg ctgatactcc ttgtatttag tatcgtcaac cacttgccgg 10440
ctcatgggct ggaagtttct gaagaacgag tacatgcggt ccttgtagct ttctggaatg 10500
tagaagccct ggtagccaat attgtagttg gccaacatct gcaccaggaa ccagtccttg 10560
gtcatgttgc actgagctac gttgtagccc tccccgtcaa ctgagcgttt aatctcaaac 10620
tcattgggag taagcaggcg gtcgttgccc ggccagctaa cagaagagtc aaaggtaatg 10680
gccaccttct taaaggtgtg attaagatag aaggttccgt caaggtatgg tatggagcca 10740
gagtaggtgt agtaagggtc gtagcctgat cccagggaag gggtttcctt tgtcttcaag 10800
cgtgtgaagg cccaaccgcg aaatgctgcc cagttgcgcg atgggatgga gatgggcacg 10860
ttggtggcgt tggcgggtat ggggtatagc atgttggcgg cggaaaggta gtcattaaag 10920
gactggtcgt tggtgtcatt tctgagcatg gcttccagcg tggaggccgt gttgtgggcc 10980
atggggaaga aggtggcgta aagacaaatg ctgtcaaact taatgctagc cccgtcaact 11040






ctaagatcgt ttcccagaga gctctgcaga accatgttaa catccttcct gaagttccat 11100
tcatatgtat atgagcctgg caggaggagg aggtttttaa tggcaaaaaa cttttggggc 11160
acctgaatgt gaaagggcac gtagcggccg tttcccaaca acatggagcg ataacggagg 11220
cccgcattgc ggtggtggtt aaagggatta acgttgtcca tgtagtccag agaccagcgc 11280
gccccaaggt taatgtagca gtctacaagc ccgggagcca ccactcgctt gttcatgtag 11340
tcgtaggtgt tggggttgtc agatatttcc acattggtgg ggttgtattt tagpttgtct 11400
ggcaggtaca gcgcaatatt ggagtaaagg aaatttctcc ataggttggc atttaggtta 11460
atttccatgg caaagttgtt acccactcct atttcattac gtgttgcaaa agtttcatct 11520
tttgtccatg tagtatctcc attatcgcct gagccattgc cattagcctt aatagcttga 11580
taggtgtcag ttaccccaat acccccaaga ggaaaacaat aatttggcaa ttcatcctca 11640
gttccatggt tttcaatgat tctaacatct ggatcatagc tgtctacagc ctgattccac 11700
atagaaaaat atctggttct atcacctatg gaatcaagca agagttgata ggacagctct 11760
gtgtttctgt cttgcaaatc taccacggca tttagctgcg atgcctgacc agcaagaaca 11820
cccatgttgc cagtgctgtt ataatacatt aggccaataa aattgtccct gaaagcaatg 11880
taattgggtc tgtttggcat agattgttga cccaacatag ctttagaatt ttcatcacct 11940
tttccaggtt tgtaagacag atgtgtgtct ggggtttcca tatttacatc ttcactgtac 12000
aaaaccactt ttggtttagt agcattgcct tgccggtcgt tcaaagaggt agtatttgag 12060
aagaattgca agtcaacctt tggaagaggc accccttttt catccggaac cagaacggat 12120
tgaccaccaa aaggatttgt aggcctggca taagatccat agcatggttt catgggagtt 12180
gtttfetttaa gcactctccc tcctgccgca ttagcatcag cttcgttcca ctgagattcg 12240
ccaatttgag gttctggttg ataggaagga tctgcgtata caggtttagc ttgtgtttct 12300
gcattgtctg atcctatttg tagcccgctt tttgtaattg tttctccaga caaaggagcc 12360
tgggcataga catgtgtttt cttagtagcc tgatctcgag cgttttgctc ttcttcttcc 12420
tcttcttcat cttcatcttc ctcttcttca tcctcggcaa ctgcccggcc gctatcttcg 12480
gtttgttccc actcacagga gttaggagcg cccttgggag ctagagcgtt gtaggcagtg 12540
ccggagtagg gcttaaaagt aggccccctg tccagcacgc cgcggatgtc aaagtacgtg 12600
gaagccatat caagcacacg gttgtcaccc acagccaggg tgaaccgcgc tttgtacgag 12660
tacgcggtat cctcgcggtc cacagggatg aaccgcagcg tcaaacgctg ggaccggtct 12720
gtggttacgt cgtgcgtagg tgccaccgtg gggtttctaa acttgttatt caggctgaag 12780
tacgtctcgg tggcgcgggc aaactgcacc agcccggggc tcaggtactc cgaggcgtcc 12840
tggcccgaga tgtgcatgta agaccactgc ggcatcatcg aaggggtagc catcttggaa 12900
agcgggcgca cggcggctca gcagctcctc tggcggcgac atggacgcat acatgacaca 12960
tacgacacgt tagctattta gaagcatcgt cggcgcttca gggattgcac ccccagaccc 13020
acgatgctgt tcagtgtgct ttgccagttg ccactggcta cgggccgcat cgatcgcgga 13080
ccgctggcgg cacggcgcag ggacgcgcgg ctagggcggg ttacaacaac ggcggacggc 13140
cctggcagca caggtttctg ctgggtgtca gcggggggag gcaggtccag cgttacaggt 13200
gtgtgctggc ccagcactcc ggtagccatg ggcgcgatgg gacgggtggt gggcaggcct 13260
tgctttagtg cctcctcgta cgagggaggc tcatctattt gcgtcaccag agtttcttcc 13320
ctgtcgggcc gcggacgctt ttcgccacgc ccctctggag acactgtctc cacggccggt 13380
ggaggctcct ctacgggagg gcggggatca agcttactgt taatcttatt ttgcactgcc 13440
tggttggcca ggtccaccac cccgctaatg ccagaggcca ggccatctac caccttttgt 13500
tggaaatttt gctctttcaa cttgtccctc agcatctggc ctgtgctgct gttccaggcc 13560
ttgctgccat agttcttaat ggtggaaccg aaatttttaa tgccgctcca cagcgagccc 13620
cagctgaagg cgccaccgct catattgctg gtgccgatat cttgccagtt tcccatgaac 13680
gggcgcgagc cgtgtcgcgg ggccagagac gcaaagttga tgtcttccat tctacaaaat 13740
agttacagga ccaagcgagc gtgagactcc agacttttta ttttgatttt tccacatgca 13800
acttgttttt aatcagtgtc tctgcgcctg caaggccacg gatgcaattc cgggcacggc 13860
gccaatcgcc gcggcgatca gtggaataag gaggggcagg ataccgccgc gcatgcgacg 13920
gtgcgacgcg cgccgccgcc ggtggtgcgc acgacgcatg ccgcccgtca ggccgtggcc 13980
ggccatgccc ctcctacggt gcattcttcc tcggaatccc ggcaccggga aacggaggcg 14040
gcaggtgagg gccatatctg caagaaccac aaagaccggc ttttaaacga tgctggggtg 14100
gtagcgcgct gttggcagca ccagggtcct gcctccttcg cgagccaccc tgcgcacgga 14160
aatcggggcc agcacgggct ggcgacggcg acggcggcgg cgggttccag tggtggttcg 14220
gcgtcgggta gtcgctcgtc ttctggggcg gtaggtgtag ccacgatagc cgggggtagg 14280
cgcgatggaa ggatgtaggg catattcggg cagtagtgcg ctggcggtgc cgtacttcct 14340
ggaacggcgc gggcgccggg gggctgaaac gcgaaacatc cacgggtccg tttgcacctc 14400
cgtagaggtt ttggacgcgg ccgcagcggc cgcctgcacc gcggcatctg ccaccgccga 14460
ggcaaccggg gacgtttgtg tctccatgcc ctctgtggca gtggcaatac tagtgctact 14520
ggtggtgggt atctgaacgt ccacggtctg cacgcccagt cccggtgcca cctgcttgat 14580
tggccgcacg cggacctcgg gctccagccc aggctccacg gtcatttttt ccaagacatc 14640
ttccagtcgc tggcgcttgg gtaccatcag ctgcacggtg ggtgccaagt caccagactc 14700
gcgctttagg ccgcgctttt cttcggacgg tgcaagcgtg ggcagcacct gctgcagtgt 14760
cacgggcttt aggctaggtg ttgggttgcc ctcgtccagc ggcaacgcca acatgtcctt 14820
atgccgcttt ccgtaggcaa actccccgag gcgctcgttg gcctgctcaa gcaggtcctc 14880
gtcgccgtac acctcatcat acacgcgctt gtaggtgcgg gtggagcgct caccgggcgt 14940
aaaaactacg gtggtgccgg gtcgcaaaac acgtcttacg cgtcgacctt tccactgtac 15000
ccgccgcctg ggcgcggttg cgtgcagcag ttccacctcg tcgtcaagtt catcatcatc 15060
atcatctttc tttttctttt tgacccgctt tagctttcgg ggcttgtaat cctgctcttc 15120
cttcttcggg gggccataga tctccggcgc gatgacctgg agcatctctt ctttgatttt 15180
gcgcttggac atagcttcgt tgcgcgccgc cgccgctgga tacatacaac agtacgagtc 15240
taagtagttt tttcttgcaa tctagttgcg cggggggcgg gtgcgcacgg gcacgcgcag 15300
gccgctaacc gagtcgcgca cccagtacac gttgcccctg cgaccctgag tcatagcact 15360
aatggccgcg gctgctgcgg cggccgctcg tcgcctggac ctggggggca cagtgacaat 15420
acccgcggcc agccttcgag cggcccgcat ggccgcccgt cggccggtgc gacgtgcgcg 15480
gttaagcagg gccgccgccg cgcgttgggc ggcagtgccg ggtcggcggc ggtggcgacg 15540
tgctacgcgc ctccgccgtc tcttcatttt agcataacgc cgggctccgc gcaccacggt 15600
ctgaatggcc gcgtccactg tggacactgg tggcggcgtg ggcgtgtagt tgcgcgcctc 15660
CtCCaCC5.CC yCy tCdatyy CCftCotCCfaC ggtggtgcgc ccagtgcyyc cgcgLirfcgtg 15720
cgcgccccag ggcgcgcggt agtgcccgcg cacgcgcact gggtgttggt cggagcgctt 15780
ctttgccccg ccaaacatct tgcttgggaa gcgcaggccc cagcctgtgt tattgctggg 15840
cgatataagg atggacatgt ttgctcaaaa agtgcggctc gataggacgc gcggcgagac 15900
tatgcccagg gccttgtaaa cgtaggggca ggtgcggcgt ctggcgtcag taatggtcac 15960
tcgctggact cctccgatgc tgttgcgcag cggtagcgtc ccgtgatctg tgagagcagg 16020
aacgttttca ctgacggtgg tgatggtggg ggctggcggg cgcgccaaaa tctggttctc 16080
gggaaagcga ttgaacacgt gggtcagaga ggtaaactgg cggatgagct gggagtagac 16140
ggcctggtcg ttgtagaagc tcttggagtg cacgggcaac agctcggcgc ccaccaccgg 16200
aaagttgctg atctggctcg tggagcggaa ggtcacgggg tcttgcatca tgtctggcaa 16260
cgaccagtag acctgctccg agccgcaggt tacgtcagga gtgcaaagga gggtccatga 16320
gcggatcccg gtctgagggt cgccgtagtt gtatgcaagg taccagctgc ggtactgggt 16380
gaaggtgctg tcattgctta ttaggttgta actgcgtttc ttgctgtcct ctgtcagggg 16440
tttgatcacc ggtttcttct gaggcttctc gacctcgggt tgcgcagcgg gggcggcagc 16500
ttctgccgct gcctcggcct cagcgcgctt ctcctccgcc cgtgtggcaa aggtgtcgcc 16560
gcgaatggca tgatcgttca tgtcctccac cggctgcatt gccgcggctg ccgcgttgga 16620
gttctcttcc gcgccgctgc cactgttgtt gccgccgcct gcgccatccc cgccctgttc 16680
ggtgtcatct tttaagcttg cctggtaggc gtccacatcc aacagtgcgg gaatgttacc 16740
accctccagg tcatcgtagg tgatcctaaa gccctcctgg aagggttgcc gcttgcggat 16800
gcccaacaag ttgctcaggc ggctgtgggt gaagtccacc ccgcatcctg gcagcaaaat 16860
gatgtctgga tggaaggctt cgtttgtata taccccaggc atgacaagac cagtgactgg 16920
gtcaaacccc agtctgaagt tgcgggtgtc aaactttacc ccgatgtcgc tttccagaac 16980
cccgttctgc ctgcccactt tcaagtagtg ctccacgatc gcgttgttca taaggtctat 17040
ggtcatggtc tcggagtagt tgccctcggg cagcgtgaac tccacccact catatttcag 17100
ctccacctgt ttgtccttag taagcgagcg cgacaccatc acccgcgcct taaacttatt 17160
ggtaaacatg aactcgttca catttggcat gttggtatgc aggatggttt tcaggtcgcc 17220
gccccagtgc gaacggtcgt caagattgat ggtctgtgtg cttgcctccc ccgggctgta 17280
gtcattgttt tgaatgaccg tggttagaaa gttgctgtgg tcgttctggt agttcaggga 17340
tgccacatcc gttgacttgt tgtccacaag gtacacacgg gtggtgtcga ataggggtgc 17400
caactcagag taacggatgc tgtttctccc cccggtaggc cgcaggtacc gcggaggcac 17460
aaacggcggg tccaggggag catcgaaggg ggaacccagc gccgccgcca ctggcgccgc 17520
gctcaccacg ctctcgtagg agggaggagg accttcctca tacatcgccg cgcgctgcat 17580
actaagggga atacaagaaa accaacgctc ggtgccatgg ccttggtgag ttttttattt 17640
tgcatcatgc tttttttttt tttttttaaa acattctccc cagcctgggg cgaaggtgcg 17700
caaacgggtt gccactccct cccaaatcca ggacgctgct gtcgtctgcc gagtcatcgt 17760
cctcccacac cagaccccgc tgacggtcgt gcctttgacg acgggtgggc gggcgcgggc 17820
cgggcacatc cctgtgctcc tgcgcatacg tcttccatct actcatcttg tccactaggc 17880
tctctatccc gttgttggga aatgccggag gcaggttctt ttcgcgctgc ggctgcagca 17940
gcgagttgtt taggtactcc tcctcgccca gcaggcgcgg gcgggtggtg cgagtgctgg 18000
taaaagaccc tatcaagctt ggaaatgggc tactcgcatc tgaccgcggg gccgcagcgc 18060
ctagatcgga caagctgctt ggcctgcgga agctttcctt tcgcagcgcc gcctctgcct 18120
gctcgcgctg ttgcaactct agcagggtct gcggttgcgg ggaaaacacg ctgtcgtcta 18180
tgtcgtccca gaggaatcca tcgttaccct cgggcacctc aaatcccccg gtgtagaaac 18240
cagggggcgg tagccagtgc gggttcaaga tggcattggt gaaatactcg gggttcacgg 18300
cggccgcgcg atgcaagtag tccattaggc gattgataaa cggccggttt gaggcataca 18360
tgcccggttc catgttgcgc gcggtcatgt ccagcgccac gctgggcgtt accccgtcgc 18420
gcatcaggtt aaggctcacg ctctgctgca catagcgcaa gatgcgctcc tcctcgctgt 18480
ttaaactgtg caacgagggg atcttctgcc gccggttggt cagcaggtag ttcagggttg 18540
cctccaggct gcccgtgtcc tcctgcccca gcgcgcggct gacacttgta atctcctgga 18600
aagtatgctc gtccacatgc gcctgaccta tggcctcgcg gtacagtgtc agcaagtgac 18660
ctaggtatgt gtcccgggac acgctgccac tgtccgtgaa gggcgctatt agcagcagca 18720
acaggcgcga gttgggcgtc agcaagctag acacggtcgc gcggtcgcct gtgggagccc 18780
gcacccccca cagcccctgc aagttcttga aagcctggct caggtttacg gtctgcaggc 18840
cttgtctact ggtctggaaa aaatagtctg gcccggactg gtacacctca ctttgcggtg 18900
tctcagtcac cattagccgc agtgcgctca caaagttggt gtagtcctcc tgtccccgcg 18960
gcacgttggc gggctgtgta ctcaggaagg cgtttagtgc aaccatggag cccaggttgc 19020
cctgctgctg cgcgcgctca cgctgcgcca cggcctcgcg cacatccccc accagccggt 19080
ccaggttggt ctgcacgttg ccgctgttgt aacgagccac gcgctgaagc agcgcgtcgt 19140
agaccaggcc ggcctGafeeg ggecggafcgg GGGfcgtfcfcfcc ggccagcgcg tttaegatcy 19200
ccagcacctt ctcgtgcgtg gggtttgcgc gcgccgggac caccgcttcc agaattgcgg 19260
agagccggtt ggcctgcggc tgctgccgga acgcgtcagg gttacgcgca gtcagcgaca 19320
tgatgcggtc catgacctgg cgccagtcgt ccgtggagtt aaggccggac ggctggctct 19380
gcagcgccgc ccgcaccgcc gggtccgttg cgtcttgcat catctgatca gaaacatcac 19440
cgcttagtac tcgccgtcct ctggcbcgta ctcatcgtcc tcgtcatatt cctccacgcc 19500
gccgacgttg ccagcgcgcg cgggtgccac cgccagccca ggtccggccc cagctgcctc 19560
cagggcgcgt cggcttgggg cccagcgcag gtcagcgccc gcgtcaaagt aggactcggc 19620
ctctctatcg ccgctgcccg tgccagccag ggccctttgc aggctgtgca tcagctcgcg 19680
gtcgctgagc tcgcgccgcc ggctcacgct cacggccttg tggatgcgct cgttgcgata 19740
aacgcccagg tcgtcgctca aggtaagcac cttcaacgcc atgcgcatgt agaacccctc 19800
gatctttacc tccttgtcta tgggaacgta aggggtafcgg tatatcttgc gggcgtaaaa 19860
cttgcccaga ctgagcatgg aatagttaat ggcggccacc ttgtcagcca ggctcaagct 19920
gcgctcctgc accactatgc tctgcagaat gtttatcaaa tcgagcagcc agcggccctc 19980
gggctctact atgtttagca gcgcatccct gaatgcctcg ttgtccctgc tgtgctgcac 20040
tataaggaac agctgcgcca tgagcggctt gctatttggg ttttgctcca gcgcgcttac 20100
aaagtcccac agatgcatca gtcctatagc cacctcctcg cgcgccacaa gcgtgcgcac 20160
gtggttgtta aagctttttt gaaagttaat ctcctggttc accgtctgct cgtacgcggt 20220
taccaggtcg gcggccgcca cgtgtgcgcg cgcgggacta atcccggtcc gcgcgtcggg 20280
ctcaaagtcc tcctcgcgca gcaaccgctc gcggttcagg ccatgccgca actcgcgccc 20340
tgcgtggaac tttcgatccc gcatctcctc gggctcctct ccctcgcggt cgcgaaacag 20400
gttctgccgc ggcacgtacg cctcgcgcgt gtcacgcttc agctgcaccc ttgggtgtcg 20460
ctcaggagag ggcgctccta gccgcgccag gccctcgccc tcctccaagt ccaggtagtg 20520
ccgggcccgg cgccgcgggg gttcgtaatc accatctgcc gccgcgtcag ccgcggatgt 20580
tgcccctcct gacgcggtag gagaagggga gggtgccctg catgtctgcc gctgctcttg 20640
ctcttgccgc tgctgaggag gggggcgcat ctgccgcagc accggatgca tctgggaaaa 20700
gcaaaaaagg ggctcgtccc tgtttccgga ggaatttgca agcggggtct tgcatgacgg 20760
ggaggcaaac ccccgttcgc cgcagtccgg ccggcccgag actcgaaccg ggggtcctgc 20820
gactcaaccc ttggaaaata accctccggc tacagggagc gagccactta atgctttcgc 20880
tttccagcct aaccgcttac gccgcgcgcg gccagtggcc aaaaaagcta gcgcagcagc 20940
cgccgcgcct ggaaggaagc caaaaggagc gctcccccgt tgtctgacgt cgcacacctg 21000
ggttcgacac gcgggcggta accgcatgga tcacggcgga cggccggatc cggggttcga 21060
accccggtcg tccgccatga tacccttgcg aatttatcca ccagaccacg gaagagtgcc 21120
cgcttacagg ctctcctttt gcacggtcta gagcgtcaac gactgcgcac gcctcaccgg 21180
ccagagcgtc ccgaccatgg agcacttttt gccgctgcgc aacatctgga accgcgtccg 21240
cgactttccg cgcgcctcca ccaccgccgc cggcatcacc tggatgtcca ggtacatcta 21300
cggatatcat cgccttatgt tggaagacct cgcccccgga gccccggcca ccctacgctg 21360
gcccctctac cgccagccgc cgccgcactt tttggtggga tatcagtacc tggtgcggac 21420
ttgcaacgac tacgtctttg actcaagggc ttactcgcgt ctcaggtaca ccgagctctc 21480
gcagccgggt caccagaccg ttaactggtc cgttatggcc aactgcactt acaccatcaa 21540
cacgggcgca taccaccgct ttgtggacat ggatgacttc cagtctaccc tcacgcaggt 21600 
gcagcaggcc atattagccg agcgcgttgt cgccgacctg gccctgcttc agccgatgag 21660 
gggcttcggg gtcacacgca tgggaggaag agggcgccac ctacggccaa actccgccgc 21720 
cgccgtagcg atagatgcaa gagatgcagg acaagaggaa ggagaagaag aagtgccggt 21780 
agaaaggctc atgcaagact actacaaaga cctgcgccga tgtcaaaacg aagcctgggg 21840 
catggccgac cgcctgcgca ttcagcaggc cggacccaag gacatggtgc ttctgtcgac 21900 
catccgccgt ctcaagaccg cctactttaa ttacatcatc agcagcacct ccgccagaaa 21960 
caaccccgac cgccacccgc tgccgcccgc cacggtgctc agcctacctt gcgactgtga 22020 
ctggttagac gcctttctcg agaggttttc cgatccggtc gatgcggact cgctcaggtc 22080 
cctcggtggc ggagtaccta cacaacaatt gttgagatgc atcgttagcg ccgtatccct 22140 
gccgcacggc agccccccgc caacccataa ccgggacatg acgggcggcg tcttccaact 22200 
gcgcccccgc gagaacggcc gcgccgtcac cgagaccatg cgccgtcgcc gcggggagat 22260 
gatcgagcgc tttgtcgacc gcctcccggt gcgccgtcgt cgccgccgtg tcccccctcc 22320 
cccaccgccg ccagaagaag aagaagaagg ggaggccctt atggaagagg agattgaaga 22380 
agaagaggcc cctgtagcct ttgagcgcga ggtgcgcgac actgtcgccg agctcatccg 22440 
tcttctggag gaggagttaa ccgtgtcggc gcgcaactcc cagtttttca acttcgccgt 22500 
ggacttctac gaggccatgg agcgccttga ggccttgggg gatatcaacg aatccacgtt 22560 
gcgacgctgg gttatgtact tcttcgtggc agaacacacc gccaccaccc tcaactacct 22620 
etfetcagcgc ctgcgaaact acgccgtctt cgcccggeae g by gage tea atctcgcgca 22680 
ggtggtcatg cgcgcccgcg atgccgaagg gggcgtggtc tacagccgcg tctggaacga 22740
gggaggcctc aacgccttct cgcagctcat ggcccgcatc tccaacgacc tcgccgccac 22800
cgtggagcga gccggacgcg gagatctcca ggaggaagag atcgagcagt tcatggccga 22860
aatcgcctat caagacaact caggagacgt gcaggagatt ttgcgccagg ccgccgtcaa 22920
cgacaccgaa attgattctg tcgaactctc tttcaggttc aagctcaccg ggcccgtcgt 22980 
cttcacgcag aggcgccaga ttcaggagat caaccgccgc gtcgtcgcgt tcgccagcaa 23040
cctccgcgcg cagcaccagc tcctgcccgc gcgcggcgcc gacgtgcccc tgccccctct 23100 
cccggcgggt cccgagcccc ccctacctcc gggggcccgc ccgcgtcacc gcttttagat 23160 
gcatcatcca aggacacccc cgcggcccac cgcccgccgc gcggtaccgt agtcgcgccg 23220 
cggggatgcg gcctcttgca agtcatcgac gccgccacca accagcccct ggaaatcagg 23280
tatcacctgg acctagcccg cgccctgacc cggctatgcg aggtaaacct gcaggagctc 23340
ccgcctgacc tgtcgccgcg ggagctccag accatggaca gctcccatct gcgcgatgtt 23400 
gtcatcaagc tccgaccgcc gcgcgcggac atctggactt tgggctcgcg cggcgtggtg 23460 
gtccgatcca ccataactcc cctcgagcag ccagacggtc aaggacaagc agccgaagta 23520
gaagaccacc agccaaaccc gccaggcgag gggctcaaat tcccactctg cttccttgtg 23580
cgcggtcgtc aggtcaacct cgtgcaggat gtacagcccg tgcaccgctg ccagtactgc 23640 
gcacgttttt acaaaagcca gcacgagtgt tcggcccgtc gcagggactt ctactttcac 23700
cacatcaaca gccactcctc caactggtgg cgggagatcc agttcttccc gatcggctcg 23760
catcctcgca ccgagcgtct ctttgtcacc tacgatgtag agacctatac ttggatgggg 23820 
gcctttggga agcagctcgt gcccttcatg ctggttatga agttcggcgg agatgagcct 23880
ctggtgaccg ccgcgcgaga cctagccgtg gaccttggat gggaccgctg ggaacaagac 23940 
ccgcttacct tctactgcat caccccagaa aaaatggcca taggtcgcca gtttaggacc 24000
tttcgcgacc acctgcaaat gctaatggcc cgtgacctgt ggagctcatt cgtcgcttcc 24060
aaccctcatc ttgcagactg ggccctgtca gaacacgggc tcagctcccc tgaggagctc 24120
acctacgagg aacttaaaaa attgccctcc atcaagggca ccccgcgcfct cttggaactt 24180 
tacatcgtgg gccacaacat caacggcttc gacgagatcg tgctcgccgc ccaggtaatt 24240 
aacaaccgtt ccgaggtgcc gggacccttc cgcatcacac gcaactttat gcctcgcgcg 24300
ggaaagatac ttttcaacga tgtcaccttc gccctgccaa acccgcgttc caaaaagcgc 24360
acggactttt tgctctggga gcagggcgga tgcgacgaca ctgacttcaa ataccagtac 24420
ctcaaagtca tggttaggga cacctttgcg ctcacccaca cctcgctccg gaaggccgcg 24480 
caggcatacg cgctacccgt agaaaaggga tgctgcgcct accaggccgt caaccagttc 24540 
tacatgctag gctcttaccg ttcggaggcc gacgggtttc cgatccaaga gtactggaaa 24600
gaccgcgaag agtttgtcct caaccgcgag ctgtggaaaa aaaagggaca ggataagtat 24660
gacatcatca aggaaaccct ggactactgc gccctagacg tgcaggtcac cgccgagctg 24720 
gtcaacaagc tgcgcgactc ctacgcctcc ttcgtgcgtg acgcggtagg tctcacagac 24780
gccagcttca acgtcttcca gcgtccaacc atatcatcca actcacatgc catcttcagg 24840 
cagatagtct tccgagcaga gcagcccgcc cgtagcaacc tcggtcccga cctcctcgct 24900 
ccctcgcacg aactatacga ttacgtgcgc gccagcatcc gcggtggaag atgctaccct 24960 
acatatcttg gaatactcag agagcccctc tacgtttacg acatttgcgg catgtacgcc 25020
tccgcgctca cccaccccat gccatggggt cccccactca acccatacga gcgcgcgctt 25080
gccgcccgcg catggcagca ggcgctagac ttgcaaggat gcaagataga ctacttcgac 25140
gcgcgcctgc tgcccggggt ctttaccgtg gacgcagacc ccccggacga gacgcagcta 25200
gacccactac cgccattctg ttcgcgcaag ggcggccgcc tctgctggac caacgagcgc 25260
ctacgcggag aggtagccac cagcgttgac cttgtcaccc tgcacaaccg cggttggcgc 25320
gtgcacctgg tgcccgacga gcgcaccacc gtctttcccg aatggcggtg cgttgcgcgc 25380
gaatacgtgc agctaaacat cgcggccaag gagcgcgccg atcgcgacaa aaaccaaacc 25440
ctgcgctcca tcgccaagtt gctgtccaac gccctctacg ggtcgtttgc caccaagctt 25500
gacaacaaaa agattgtctt ttctgaccag atggacgcgg ccaccctcaa aggcatcacc 25560
gcgggccagg tgaatatcaa atcctcctcg tttttggaaa ctgacaatct tagcgcagaa 25620
gtcatgcccg cttttgagag ggagtactca ccccaacagc tggccctcgc agacagcgat 25680
gcggaagaga gtgaggacga acgcgccccc accccctttt atagcccccc ttcaggaaca 25740
cccggtcacg tggcctacac ctataaacca atcaccttcc ttgatgccga agagggcgac 25800
atgtgtcttc acaccctgga gcgagtggac cccctagtgg acaacgaccg ctacccctcc 25860
cacttagcct ccttcgtgct ggcctggacg cgagccttcg tctcagagtg gtccgagttt 25920
ctatacgagg aggaccgcgg aacaccgctc gaggacaggc ctctcaagtc tgtatacggg 25980
gacacggaca gccttttcgt caccgagcgt ggacaccggc tcatggaaac cagaggtaag 26040
aaacgcatca aaaagcatgg gggaaacctg gtttttgacc ccgaacggcc agagctcacc 26100
tggctcgtgg aatgcgagac cgtctgcggg gcctgcggcg cggatgccta ctccccggaa 26160
tcggtatttc tcgcgcccaa gctctacgcc cttaaaagtc tgcactgccc ctcgtgcggc 26220
gcctcctcca agggcaagct gcgcgccaag ggccacgccg cggaggggct ggactatgac 26280
accatggtca aatgctacct ggccgacgcg cagggcgaag accggcagcg cttcagcacc 26340
agcaggacca gcctcaagcg caccctggcc agcgcgcagc ccggagcgca ccccttcacc 26400
gtgacccaga ctacgctgac gaggaccctg cgcccgtgga aagacatgac cctggcccgt 26460
ctggacgagc accgactact gccgtacagc gaaagccgcc ccaacccgcg aaacgaggag 26520
atatgctgga tcgagatgcc gtagagcacg tgaccgagct gtgggaccgc ctggaactgc 26580
ttggtcaaac gctcaaaagc atgcctacgg cggacggcct caaaccgttg aaaaactttg 26640
cttccttgca agaactgcta tcgctgggcg gcgagcgcct tctggcgcat ttggtcaggg 26700
aaaacatgca agtcagggac atgcttaacg aagtggcccc cctgctcagg gatgacggca 26760
gctgcagctc tcttaactac cagttgcagc cggtaatagg tgtgatttac gggcccaccg 26820
gctgcggtaa gtcgcagctg ctcaggaacc tgctttcttc ccagctgatc tcccctaccc 26880
cggaaacggt tttcttcatc gccccgcagg tagacatgat ccccccatct gaactcaaag 26940
cgtgggaaat gcaaatctgt gagggtaact acgcccctgg gccggatgga accattatac 27000
cgcagtctgg caccctccgc ccgcgctttg taaaaatggc ctatgacgat ctcatcctgg 27060
aacacaacta tgacgttagt gatcccagaa atatcttcgc ccaggccgcc gcccgtgggc 27120
ccattgccat cattatggac gaatgcatgg aaaatctcgg aggtcacaag ggcgtctcca 27180
agttcttcca cgcatttcct tctaagctac atgacaaatt tcccaagtgc accggataca 27240
ctgtgctggt ggttctgcac aacatgaatc cccggaggga tatggctggg aacatagcca 27300
acctaaaaat acagtccaag atgcatctca tatccccacg tatgcaccca tcccagctta 27360
accgctttgt aaacacttac accaagggcc tgcccctggc aatcagcttg ctactgaaag 27420
acatttttag gcaccacgcc cagcgctcct gctacgactg gatcatctac aacaccaccc 27480
cgcagcatga agctctgcag tggtgctacc tccaccccag agacgggctt atgcccatgt 27540
atctgaacat ccagagtcac ctttaccacg tcctggaaaa aatacacagg accctcaacg 27600
accgagaccg ctggtcccgg gcctaccgcg cgcgcaaaac ccctaaataa agacagcaag 27660
acacttgctt gatcaaaatc caaacagagt ctggttttta tttatgtttt aaaccgcatt 27720
gggaggggag gaagccttca gggcagaaac ctgctggcgc agatccaaca gctgctgaga 27780
aacgacatta agttcccggg tcaaagaatc caajttgtgcc aaaagagccg tcaacttgtc 27840
atcgcgggcg gatgaacggg aagctgcact gcttgcaagc gggctcagga aagcaaagtc 27900
agtcacaatc ccgcgggcgg tggctgcagc ggctgaagcg gcggcggagg ctgcagtctc 27960
caacggcgtt ccagacacgg tctcgtaggt caaggtagta gagtttgcgg gcaggacggg 28020
gcgaccatca atgctggagc ccatcacatt ctgacgcacc ccggcccatg ggggcatgcg 28080
cgttgtcaaa tatgagctca caatgcttcc atcaaacgag ttggcgctca tggcggcggc 28140
tgctgcaaaa cagatacaaa actacatgag acccccacct tatatattct ttcccaccct 28200
taagccccgc ccatcgatgg caaacagcta ttatgggtat tatgggtgct agcgacatga 28260
ggttgccccg tattcagtgt cgctgatttg tattgtctga agttgttttt acgttaagtt 28320
gatgcagatc aattaatacg atacctgcgt cataattgat tatttgacgt ggtttgatgg 28380
cctccacgca cgttgtgata tgtagatgat aatcattatc actttacggg tcctttccgg 28440
tgatccgaca ggttacgggg cggcgacctc gcgggttttc gctatttatg aaaattttcc 28500
ggtttaaggc gtttccgttc ttcttcgtca taacttaatg tttttattta aaataccctc 28560
tgaaaagaaa ggaaacgaca ggtgctgaaa gcgaggcttt ttggcctctg tcgtttcctt 28620
tctctgtttt tgtccgtgga atgaacaatg gaagttaacg gatccaggcc gcgagcaaaa 28680
ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc 28740
cgcccccctg acgagcatca caaaaatcaa cgctcaagtc agaggtggcg aaacccgaca 28800
ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg 28860
accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct 28920
catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt 28980
gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag 29040
tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc 29100
agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac 29160
actagaagaa cagtatttgg tatctgcgct ctgccaaagc cagttacctt cggaaaaaga 29220
gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc 29280
aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg 29340
gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat cagattatca 29400
aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc aatctaaagt 29460
atatatgagt aaacttggtc tgacagttac caatgcttaa tcagtgaggc acctatctca 29520
gcgatctgtc tatttcgttc atccatagtt gcctgactcc ccgtagtgta gataactacg 29580
atacgggagg gcttaccatc cggccccagb gctgcaatga taccgcgtga cccacgctca 29640
ccggctcctg atttatcagc aataaaccag ccagccggaa gtgccgagcg cagaagtggt 29700
cctgcaactt tatccgcctc catccagtct attagttgtt gccgggaagc tagagtaagt 29760
agttcgccag ttaatagttt tcgcaacgtt gttgccattg ctacaggcat cgtggtgtca 29820
cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag gcgagttaca 29880
tgatccccca tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat agttgtcaga 29940
agtaagttgg ccgcagtgtt atcactcatg gttatggcag cactgcataa ttctcttact 30000
gtcatgccat ccgtaagatg cttttctgtg actggtgagt attcaaccaa gaatacggga 30060
taataccgcg ccacatagca gaactttaaa agtgctcatc attgggaaac gttcttcggg 30120
gcgaaaactc tcaaggatct taccgctgtt gagatccagt tcgatgtaac ccactcgcgc 30180
acccaagtga tcttctgcat cttttacttt caccagcgtt tctgggtgag caaaaacagg 30240
aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa tactcatact 30300
tttccttttt caatattatt gaagcattta tcagggttat tgtctcatca gcggatacat 30360
atttg 30365



<210> 5 

<211> 33 

<212> DNA 

<213> Homo sapiens 

<400> 5 

gcggaattcg gcttggtgac ttagagaaca gag 33 

<210> 6 

<211> 33 

<212> DNA 

<213> Homo sapiens 

<400> 6 

gcgggatcct tgaacccgga ccctctcaca cta 33

<210> 7 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> derived from Adenovirus
<400> 7 

actctcttcc gcatcgctgt 20

<210> 8 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> derived from Adenovirus 
<400> 8 

cttgcgactg tgactggtta g 21

<210> 9 
<211> 20 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> derived from Adenovirus 
<400> 9 

ccgcacccac tatcttcata 20

<210> 10 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> derived from Adenovirus 
<400> 10 

ggtgtccaaa ggttcggaga 20